Abstract

Interprocedural flow analysis can be used to eliminate otherwise 
unnecessary heap allocated objects (unboxing), and in previous 
work we have shown how to do so while maintaining correctness 
with respect to the garbage collector. In this paper, we extend the
notion of flow analysis to incorporate types, enabling analysis and 
optimization of typed programs. We apply this typed analysis to 
specify a type preserving interprocedural unboxing optimization, 
and prove that the optimization preserves both type and GC safety 
along with program semantics. We also show that the unboxing 
optimization can be applied independently to separately compiled 
program modules, and prove via a contextual equivalence result 
that unboxing a module in isolation preserves program semantics. 

1. Introduction 

Many languages and compilers use a uniform object representation 
in which every source level object is represented at least initially 
by a heap allocated object. Such a representation allows polymor- 
phic functions to be compiled once and enables the implementation 
of features that use runtime type information. In this representa- 
tion machine integers and floating-point numbers are placed in a 
single-field object, a box, and this operation is called boxing. Op- 
erations such as addition require first projecting the number from 
the box (unboxing), followed by the actual addition, followed by 
the creation of a new box for the result (boxing). Boxing and un- 
boxing operations add considerable overhead, and thus it is highly 
desirable to remove them when possible - e.g. when polymorphism 
or features requiring runtime type information are not being used. 
We refer to the general class of optimizations that attempt to re- 
move unnecessary box and unbox operations as unboxing optimiza- 
tions. We refer to unboxing optimizations that attempt to eliminate 
boxing and unboxing across function boundaries as interprocedu- 
ral unboxing. We also include in this latter category optimizations 
(such as the one given in this paper) which attempt to unbox objects 
written to and read from other objects in the heap. 
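
As a rough illustration of the overhead described above, the following Python sketch models a box as a single-field object and counts allocations. The names Heap, Box, boxed_add, and unboxed_add are hypothetical and are not part of the formal language developed in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A single-field heap object wrapping a machine number."""
    value: int


class Heap:
    """Toy heap that only counts how many boxes were allocated."""

    def __init__(self) -> None:
        self.allocations = 0

    def box(self, n: int) -> Box:
        self.allocations += 1
        return Box(n)


def unbox(b: Box) -> int:
    return b.value


def boxed_add(heap: Heap, a: Box, b: Box) -> Box:
    # Project both numbers out of their boxes, add, then box the result.
    return heap.box(unbox(a) + unbox(b))


def unboxed_add(a: int, b: int) -> int:
    # The same addition once boxing and unboxing have been removed.
    return a + b
```

Adding two boxed numbers this way allocates three boxes (two operands and one result), whereas the unboxed version allocates none.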

Interprocedural unboxing presents additional challenges in a 
typed setting, since type information must be updated to reflect 
any unboxing. A box might flow to an argument in an application, 
and the parameter of the called function might flow to an unbox 
operation. If the optimization decides to remove the box and unbox
operations then it must also remove the box type on the parameter.
In other words, typed unboxing requires not just rewriting uses and 
definitions in the traditional sense, but also rewriting intermediate 
points in the program through which the unboxed values flow. At 
a high level then, the optimization can be viewed as selecting a 
set of box operations, unbox operations, and box types to remove. 
Such a selection has a global consistency requirement — a box type 
should only be removed if all boxes that flow to it are removed, 
a box operation should only be removed if all unbox operations 
it flows to are removed, and so on. Thus choosing a set of boxed 
objects to eliminate and rewriting the program to reflect this choice 
in a consistent manner requires knowing what things flow to what 
points in the program, a question that flow analyses are designed 
to answer. In this paper we use the results of flow analysis to 
formulate correctness conditions for unboxing and then prove that 
those conditions ensure correct optimization. 
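
The global consistency requirement can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the optimization defined later in the paper: program points are plain strings, `flows` is a hypothetical list of (source, destination) pairs produced by some flow analysis, and a selection is consistent when every flow edge has both ends removed or both ends kept.

```python
from typing import Iterable, List, Set, Tuple

Flow = Tuple[str, str]  # (source point, destination point)


def is_consistent(flows: Iterable[Flow], removed: Set[str]) -> bool:
    """Every flow edge must have both ends removed or both ends kept."""
    return all((src in removed) == (dst in removed) for src, dst in flows)


def shrink_to_consistent(flows: List[Flow], candidates: Set[str]) -> Set[str]:
    """Drop candidates until the selection is consistent.

    The set only ever shrinks, so the loop terminates.
    """
    removed = set(candidates)
    changed = True
    while changed:
        changed = False
        for src, dst in flows:
            if (src in removed) != (dst in removed):
                removed.discard(src)
                removed.discard(dst)
                changed = True
    return removed
```

For a single box flowing to one parameter, both points can be removed. If a value that is not a box also flows to the same parameter, the parameter must stay boxed, and so must every box that reaches it.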

In previous work [8] we considered a simpler problem, that of 
rewriting garbage-collector (GC) metadata rather than full types. 
An accurate GC requires specifying for each field of each object 
and each slot of each stack activation frame whether it contains 
a pointer into the GC heap or not (contains a machine integer, 
floating-point number, etc.). As with types, when unboxing interprocedurally,
such metadata must be rewritten in a globally consistent
manner. Our previous paper showed how to do this rewriting cor- 
rectly using the results of a flow analysis, in a whole program set- 
ting. In this paper we extend these ideas to develop a methodology 
for dealing with interprocedural optimization of statically typed 
languages (including universal polymorphism) in a type preserving 
fashion. We also show that this methodology does not depend on 
whole program compilation, and extends easily to support modular 
compilation. 

In the following sections, we begin by defining a core language 
with a polymorphic type system that has box and unbox opera- 
tions. As in our previous paper we formalize a notion of GC safety 
for our language — that the GC metadata is currently correct — and 
show that well-typed programs are GC safe throughout execution. 
Next we specify a set of abstract conditions that a reasonable flow
analysis must satisfy, with the property that any flow analysis that 
satisfies these conditions can be used in our framework to optimize 
programs. The main section of the paper defines an unboxing op- 
timization parameterized over a choice of objects to unbox, and 
gives a set of correctness conditions under which such a choice 
is guaranteed to preserve typing and preserve semantics. We show 
that this set of correctness conditions is satisfiable by constructing a 
simple unboxing algorithm which satisfies these conditions. Finally 
we extend the system slightly by defining a notion of unboxing for 
modules and show that it is correct in the sense that a module is 
contextually equivalent to its unboxing. 

While our paper is specifically about the concrete optimization 
of unboxing, the ideas used here generalize naturally to other
optimizations that change the representation of objects in a non-local
fashion. For example, dead-field elimination and dead-parameter 
elimination impose similar requirements for rewriting types and 
GC metadata in a globally consistent fashion. Flow analyses can be 
used to specify and implement these (and others), and we believe 
(based on practical experience in our compiler) that the framework 
presented here extends naturally to such optimizations. As far as we 
know, this and our previous paper are the first to use a flow analysis 
to rewrite types and GC metadata in a globally consistent fashion, 
and to use a flow analysis to formulate correctness conditions for 
this rewriting process and prove these conditions sound. 

2. A type and GC safe core language 

Consider the following untyped program (using informal notation), 
where box denotes a boxing operation that wraps its argument in 
a heap-allocated structure, and unbox denotes its elimination form 
that projects out the boxed item from the box: 

let f = λx.(box x) in unbox(unbox(f (box 3)))

The only definition reaching the variable x is the boxed machine 
integer 3. Information from an interprocedural analysis can be used 
to rewrite this program to eliminate the boxing as follows: 

let f = λx.x in f 3

This second version is much better in that it does less allocation, 
and executes fewer instructions. In this optimized version of the 
program however, an important property has changed that is not 
reflected in this untyped syntax. Specifically, the GC status of
values reaching x has changed: whereas in the original program 
all values reaching x are represented as heap allocated pointers, 
in the second program all values reaching x are represented as 
machine integers. From the standpoint of a garbage collector, a 
garbage collection occurring while x is live must treat x as a root
in the first program, and must ignore x in the second program. 

The question of which variables should be treated as roots by 
the garbage collector is a subtle but crucial one for the purposes of 
optimization and compiler correctness. Consider a modification of 
the previous example in which the function f is used polymorphically:

let f = λx.(box x) in unbox(unbox((unbox(f f)) (box 3)))

In this variant, f is applied to itself and the boxed result (itself)
is unboxed and applied to a boxed integer. The resulting doubly
boxed integer is then unboxed. Assuming that functions are represented
as heap-allocated objects, each variable in this program has a
concrete and statically known status as either a GC root or GC non-root,
since all objects passed to f are heap references. However,
an attempt to unbox this program as with the previous example results
in f being applied to both heap references (f) and non-heap
references (3).

let f = λx.x in (f f) 3

Consequently, a correct optimizer must decline to unbox this program
(at least in its entirety) to avoid incorrect GC behavior.¹

In our previous work [8] we developed a core language capturing
the essential issues of GC safety, along with an analysis and opti- 
mization framework for reasoning about and optimizing GC safe 
programs in an untyped setting. This framework allows us to show 
that given a GC safe program, our unboxing optimization will always
produce a semantically equivalent GC safe program. However,
the framework is essentially limited to untyped programs, and
consequently it does not scale to typed core languages in which
one must be able to check the well-typedness (and hence the GC
safety) of programs before and after optimization [6]. In this paper,
we develop a methodology for addressing this style of
optimization in a strongly typed setting.

2.1 Type safety 

How does the problem of unboxing change in a typed setting? 
Consider again the first example from this section using a still 
informal but now typed notation: 

let
  f : box(int) → box(box(int)) = λx:box(int).(box x)
in unbox(unbox(f (box 3)))

As before, it is apparent that the only definition reaching the 
variable x is the boxed machine integer 3, and as before we can 
consider rewriting this program to eliminate (interprocedurally) 
the boxing. However, simply rewriting the terms of the program 
is inadequate from the standpoint of type preserving compilation, 
since the result is not well-typed: 



let
  f : box(int) → box(box(int)) = λx:box(int).x
in f 3



The types of both the actual argument and the return value of f have
changed, and are no longer consistent with the type annotation for f
and x. In order to correctly unbox this program then, it is necessary
to rewrite not just the terms, but also the types: 

let f : int → int = λx:int.x in f 3

This requirement is a more substantial change than might at first 
be apparent. In the original (untyped) setting, it was sufficient to 
have information only about the direct definitions (boxes) and uses 
(unboxes) of objects. Rewriting the types in this fashion requires 
information not just about the uses and definitions, but also about 
intermediate program points (and other objects) through which the 
boxed objects flow. 
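
A minimal sketch of such a type rewrite, assuming a hypothetical labelled representation of types (the classes Int, BoxT, and Fun are illustrative, not the core language defined below) and assuming the set of box labels to remove has already been chosen consistently:

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Int:
    label: int


@dataclass(frozen=True)
class BoxT:
    label: int
    body: "Type"


@dataclass(frozen=True)
class Fun:
    label: int
    param: "Type"
    result: "Type"


Type = Union[Int, BoxT, Fun]


def rewrite_type(ty: Type, removed: FrozenSet[int]) -> Type:
    """Drop every box type whose label is in `removed`; keep all others."""
    if isinstance(ty, BoxT):
        inner = rewrite_type(ty.body, removed)
        return inner if ty.label in removed else BoxT(ty.label, inner)
    if isinstance(ty, Fun):
        return Fun(ty.label,
                   rewrite_type(ty.param, removed),
                   rewrite_type(ty.result, removed))
    return ty
```

Applied to the type box(int) → box(box(int)) with all three box labels selected, the sketch yields int → int, matching the rewritten annotation in the example above.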

In addition to incurring these additional rewriting requirements, 
the typed setting must still account for GC safety. Consider again 
the polymorphic variant of the previous untyped example (naming 
the first application of f for clarity):

let
  f : ∀α.α → box(α) = Λα.λx:α.(box x)
  g : box(∀α.α → box(α)) = f [∀α.α → box(α)] (f)
in unbox(unbox((unbox g) [box(int)] (box 3)))

Here we have f applied to itself at a universal type to produce a
boxed version of itself (g), which is then unboxed and applied to a 
boxed integer, at the boxed integer type. Attempting to unbox this 
example (rewriting types as necessary) immediately illuminates the 
problem. 

let
  f : ∀α.α → α = Λα.λx:α.x
  g : ∀α.α → α = f [∀α.α → α] (f)
in g [int] (3)

The function f is instantiated directly at a universal function type,
and via its alias (g) at a machine integer type. As with the untyped
example in the previous section, the compiler cannot assign a concrete
GC status to the variable x. For correctness then, the compiler
must not (fully) unbox this example, and must leave at least the
boxing operation on the integer parameter to f.²



¹ A conservative GC, or a GC implementation which tags pointers to distinguish
them from non-pointers, would not impose this restriction. See Section 2.2
for more discussion of the GC model.



² It is worth noting that an optimizing compiler might choose to duplicate
the body of f to make it monomorphic, and hence allow it to be unboxed.
It is also possible to use a runtime type passing interpretation to relax the

Figure 1. Syntax

In the rest of this paper we make these issues concrete and formal,
and we show how to deal with them by extending the
notion of flow analysis to incorporate types, thereby generating the 
necessary flow information to correctly rewrite types and terms in 
a consistent fashion. While we focus on a concrete optimization 
(unboxing), we believe that these ideas are generally applicable to 
flow analysis based representation optimizations in typed interme- 
diate languages. 

2.2 A core language for GC safety 

In order to give a precise account of typed flow analysis and interprocedural
unboxing, we begin by defining a type safe core language
incorporating the essential features of GC safety. The motivation
for the (small) idiosyncrasies of this language lies in the
requirements of the underlying model of garbage collection. We as- 
sume that pointers cannot be intrinsically distinguished from non- 
pointers, and hence the compiler is required to statically annotate 
the program with garbage collection meta-data such that at any 
garbage collection point the garbage collector can reconstruct ex- 
actly which live variables are roots. Typically, this takes the form of 
annotations on variables and temporaries indicating which contain 
heap-pointers (the roots) and which do not (the non-roots), along 
with information at every allocation site indicating which fields of 
the allocated object contain traceable data. This approach is common
in modern systems, and it is this approach that we target in
this paper. 

Figure 1 defines the syntax of our core language. The essence of 
the language is a standard polymorphic lambda calculus extended 
with a fix point operator, implemented via an explicit environment 
semantics. For the purposes of the semantics, we also include a 
form of degenerate type information we call traceabilities. Trace- 
abilities describe the GC status of variables: the traceability b (for 
bits) indicates something that should be ignored by the garbage 
collector, while the traceability r (for reference) indicates a GC- 
managed pointer. The traceability b is inhabited by an unspecified 
set of constants c while the traceability r is inhabited by functions 
(anticipating their implementation by heap-allocated closures) and 
by boxed objects. Anticipating the needs of the flow analysis, we 
label each type, term, value, and variable binding site with an in- 
teger label. We do not assume that labels or variables are unique 
within a program. 

constraints on the garbage collector sufficiently to permit this example [1].
These optimizations are orthogonal (but complementary) to the issues addressed
by this paper.

Types σ consist of type variables α, the base type of constants B,
function types ∀α.τ1 → τ2, and boxed types box(τ). In order
to provide a concrete implementation strategy for the garbage collector,
we insist that every type correspond to a traceability so that
we can extract the necessary garbage collection meta-data. Types
are mapped to traceabilities using the function tr(τ), defined in
Figure 2. Polymorphic functions are restricted by well-formedness
rules to only be instantiated with types with the traceability r, and
consequently tr(α) = r. We define substitution of types in the
standard way and define τ[σ^i/α] = τ[σ/α].
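
The mapping from types to traceabilities described above can be sketched as follows. This is a hedged illustration only: the class names are hypothetical, labels are omitted, and the case analysis is taken from the prose (constants have traceability b; type variables, functions, and boxes have traceability r) rather than from the figure itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TVar:
    name: str


@dataclass(frozen=True)
class Base:
    """The base type B of constants."""


@dataclass(frozen=True)
class Forall:
    """A polymorphic function type: forall var. param -> result."""
    var: str
    param: object
    result: object


@dataclass(frozen=True)
class BoxT:
    body: object


def tr(ty) -> str:
    """Return 'b' (bits, ignored by the GC) or 'r' (GC-managed reference)."""
    if isinstance(ty, Base):
        return "b"
    if isinstance(ty, (TVar, Forall, BoxT)):
        # Type variables may only be instantiated at traceability r.
        return "r"
    raise TypeError(f"not a type: {ty!r}")
```

Only the top-level constructor matters: box(B) has traceability r even though its contents have traceability b.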

Expressions e consist of labeled terms m^i and labeled values
v^i. The terms m consist of variables, functions, applications,
box introductions, box eliminations, and frames. Functions
fix f[α](x:τ1):τ2.e are polymorphic and recursive, and variable
binding sites are decorated with types. We represent heap allocation
in the language via the box_τ e term, which corresponds to
allocating a heap cell containing the value for e. The type τ is used
by the dynamic semantics to provide the meta-data with which the
heap-cell will be tagged, allowing the garbage collector to trace the
cell. However, only the top-level traceability of the type (given by
the tr() function in Figure 2) is actually required by the dynamic
semantics, and so the language can be erased into an untyped language
in the obvious way. Objects can be projected out of an allocated
object by the unbox e operation. Frames ρ(e) are discussed
further below.

Values consist of either constants, closures, or heap-allocated
boxes. We distinguish between the introduction form (box_τ e) and
the value form (⟨v^i:τ⟩) for allocated objects. The introduction form
corresponds to the allocation instruction, whereas the value form
corresponds to the allocated heap value. This distinction is key for
the formulation of GC safety and the dynamic semantics. For the
purposes of the dynamic semantics we also distinguish between
functions (fix f[α](x:τ1):τ2.e) and the heap allocated closures
that represent them at runtime (⟨ρ, fix f[α](x:τ1):τ2.e⟩).

For notational convenience, we will sometimes use the notation
v_b to indicate that a value v is a non-heap-allocated value (i.e. a
constant c), and v_r to indicate that a value v is a heap-allocated
value (i.e. either a function value or a boxed value). If t is a
traceability meta-variable, then we use v_t to indicate that v is a
value of the same traceability as t.

In examples, we use a derived let expression, taking it to be 
syntactic sugar for application in the usual manner. Environments
ρ map variables to values. The term ρ(e) executes e in the environment
ρ rather than the outer environment: all of the free variables
of e are provided by ρ. The nested set of these environments at any
point can be thought of as the activation stack frames of the executing
program. The traceabilities of the typing annotations on variables
in the environments play the role of stack-frame GC meta-data, indicating
which slots of the frame are roots (traceability r). The environments
buried in closures (⟨ρ, fix f[α](x:τ1):τ2.e⟩) similarly
provide the traceabilities of values reachable from the closure via 
the type annotations on the variables in the environment, and hence 
provide the GC meta-data for tracing through closures. While we 
do not make the process of garbage collection explicit, it should 
be clear how to extract the appropriate set of GC roots from the 
environment and any active frames. 
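
To make the root-extraction step concrete, here is a small sketch under stated assumptions: an environment is modelled as a list of (variable, type, value) triples, types are the string "B" or tagged tuples such as ("box", "B"), and the helper names frame_roots and stack_roots are hypothetical.

```python
from typing import Any, List, Tuple

Binding = Tuple[str, Any, Any]  # (variable, type annotation, value)


def tr(ty: Any) -> str:
    # Only the base type "B" is untraced; boxes, functions, and type
    # variables are all GC-managed references in this toy encoding.
    return "b" if ty == "B" else "r"


def frame_roots(env: List[Binding]) -> List[str]:
    """Variables of one frame whose annotation has traceability r."""
    return [name for name, ty, _value in env if tr(ty) == "r"]


def stack_roots(frames: List[List[Binding]]) -> List[str]:
    """Roots of all active frames, outermost frame first."""
    return [name for env in frames for name in frame_roots(env)]
```

The values themselves are never inspected: the annotation alone decides whether a slot is a root, which is why the annotations must be kept correct by every optimization.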

This core language contains the appropriate information to for- 
malize a notion of GC safety consisting of two complementing 
pieces. First we define a dynamic semantics in which reductions 
that might lead to undefined garbage-collector behavior are explic- 
itly undefined. Programs that take steps in this semantics do not 
introduce ill-formed heap objects. Secondly, we define a notion 
of a traceable program: one in which all heap values have valid 
GC meta-data. Reduction steps in the semantics can then be shown 
to maintain the traceability property in addition to the usual well-typedness
property. The GC correctness criterion for a compiler optimization
then becomes simply the usual one: that the optimization
map well-typed programs to semantically equivalent well-typed
programs. By showing that typable programs are both traceable
programs and have well-defined semantics, we thereby show that 
GC correctness for a compiler optimization can be achieved simply 
by preserving well-typedness. 

It is worth noting that in our implementation, the compiler in- 
termediate language under consideration is substantially more low- 
level: a control-flow graph based, static single assignment interme- 
diate representation. We believe however that all of the key issues 
are captured faithfully in this higher-level representation, and with 
greater clarity of presentation.

2.3 Operational semantics

We choose to use an explicit environment semantics rather than a 
standard substitution semantics since this makes the GC meta-data 
(implicit in the types) for stack frames and closures explicit in the 
semantics. Thus a machine state (ρ, e) supplies an environment ρ
for e that provides the values of the free variables of e during execution.
Environments contain typing annotations on each of the
variables mapped by the environment, which provide the traceabilities
of the variables.

Reduction in this language is for the most part fairly standard. 
We deviate somewhat in that we explicitly model the allocation of 
heap objects as a reduction step; hence there is an explicit reduction
mapping a function term fix f[α](x:τ1):τ2.e to an allocated
closure ⟨ρ, fix f[α](x:τ1):τ2.e⟩, and similarly for boxed objects
and values. More notably, beta-reduction is restricted to only permit 
construction of a stack frame when the type for the parameter vari- 
able has an appropriate traceability for the actual argument value. 
This captures the requirement that stack frames have correct meta- 
data for the garbage collector. In actual practice, incorrect meta- 
data for stack frames leads to undefined behavior (since incorrect 
meta-data may cause arbitrary memory corruption by the garbage 
collector); similarly, here in the meta-theory we leave the behavior
of such programs undefined. In a similar fashion, we only define
the reduction of the allocation operation to an allocated value
(box_τ v_t ↦ ⟨v_t:τ⟩) when the operation meta-data is appropriate
for the value (i.e. tr(τ) = t).

It is important to note that this semantics does not model a 
dynamically checked language, in which there is an explicit check 
of the meta-data associated with these reductions. The point is 
simply that the semantics only specifies how programs behave 
when these conditions are met — in all other cases the behavior of 
the program is undefined. 
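
The partiality of the allocation rule can be sketched as a function that is simply not defined on mis-tagged inputs. In this illustrative Python model, returning None stands for "no reduction rule applies"; it is not meant to suggest a runtime check in the compiled program. The names Const, BoxVal, and step_box, and the encoding of types as "B" or tagged tuples, are assumptions of the sketch.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Const:
    value: int          # constants have traceability b


@dataclass(frozen=True)
class BoxVal:
    contents: Any       # the boxed value
    ty: Any             # type tag recorded at allocation


def tr(ty: Any) -> str:
    return "b" if ty == "B" else "r"


def value_traceability(v: Any) -> str:
    return "b" if isinstance(v, Const) else "r"


def step_box(ty: Any, v: Any) -> Optional[BoxVal]:
    """Reduce `box_ty v` to an allocated box, if the rule applies."""
    if tr(ty) != value_traceability(v):
        return None     # behavior left undefined by the semantics
    return BoxVal(v, ty)
```

Boxing the constant 3 at type B succeeds, while boxing it with a tag claiming the contents are a reference has no defined result.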



2.4 Traceability 

The operational semantics ensures that no reduction step introduces 
mis-tagged values. In order to make use of this, we define a judg- 
ment for checking that a program does not have a mis-tagged value 
in the first place. Implicitly this judgement defines what a well- 
formed heap and activation stack looks like; however, since our 
heap and stack are implicit in our machine states, it takes the form 
of a judgement on terms, values, environments, and machine states. 
The value judgement ⊢v v : t asserts that a value v is well-formed
and has traceability t. In this simple language, this corresponds
to having the types on the variables in the environment
of each function value have traceabilities which are consistent with
the values to which they are bound, and the type on each boxed
value be consistent with the traceability of the object nested in the
box. An environment is consistent, ⊢ ρ tr, when the annotation on
each variable agrees with the traceability of the value it is bound to.
The term judgement ⊢ e tr and machine state judgement ⊢ M tr
simply check that all values and environments (and hence stack
frames) contained in the term or machine state are well-formed.
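
A simplified executable reading of the value and environment judgements is sketched below. It is an approximation under stated assumptions: closures carry only their environment (the check on the function body is omitted), types are "B" or tagged tuples, and the names value_tr and env_ok are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class BoxVal:
    contents: Any
    ty: Any


@dataclass(frozen=True)
class Closure:
    env: Tuple[Tuple[str, Any, Any], ...]  # (variable, type, value) triples


def tr(ty: Any) -> str:
    return "b" if ty == "B" else "r"


def value_tr(v: Any) -> Optional[str]:
    """Traceability of a well-formed value, or None if it is mis-tagged."""
    if isinstance(v, Const):
        return "b"
    if isinstance(v, BoxVal):
        # The tag on the box must agree with the boxed contents.
        return "r" if value_tr(v.contents) == tr(v.ty) else None
    if isinstance(v, Closure):
        return "r" if env_ok(v.env) else None
    return None


def env_ok(env: Tuple[Tuple[str, Any, Any], ...]) -> bool:
    """Each annotation agrees with the traceability of the bound value."""
    return all(value_tr(val) == tr(ty) for _name, ty, val in env)
```

A mis-tagged value nested anywhere inside a box or a closure environment makes the enclosing value ill-formed as well.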



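(ρ, (fix f[α](x:τ1):τ2.e)^i) ↦ (ρ, ⟨ρ, fix f[α](x:τ1):τ2.e⟩^i)

    tr(τ) = t
    ----------------------------------------
    (ρ, (box_τ v_t^i)^j) ↦ (ρ, ⟨v_t^i:τ⟩^j)

    (ρ, e1) ↦ (ρ, e1')
    ----------------------------------------
    (ρ, (e1[τ] e2)^i) ↦ (ρ, (e1'[τ] e2)^i)

    (ρ, e2) ↦ (ρ, e2')
    ----------------------------------------
    (ρ, (v^i[τ] e2)^j) ↦ (ρ, (v^i[τ] e2')^j)

    (ρ', e) ↦ (ρ', e')
    ----------------------------------------
    (ρ, ρ'(e)^i) ↦ (ρ, ρ'(e')^i)

(ρ, ρ'(v^i)^j) ↦ (ρ, v^i)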

Figure 2. Operational Semantics 



The key result for traceability is that it is preserved under reduc- 
tion. That is, if a traceable term takes a well-defined reduction step, 
then the resulting term will be traceable. 

Lemma 1 (Preservation of traceability) 

If ⊢ M tr and M ↦ M' then ⊢ M' tr.

Proof: If ⊢ (ρ, e) tr then ⊢ ρ tr and ⊢ e tr. If (ρ, e) ↦ (ρ, e')
then the result follows if we can show ⊢ e' tr. The proof of that is
by induction on the derivation of (ρ, e) ↦ (ρ, e'). Consider the
cases for the last rule used to derive it (the cases are in the same
order as in the figure):

• In this case, e = x^k for some x and k, and e' = v^j where
x:τ = v^j ∈ ρ for some τ, v, and j. Since ⊢ ρ tr, we have
⊢v v : tr(τ), so by the traceability rules ⊢ v^j tr, as required.

Figure 3. Traceability



• In this case, e = (fix f[α](x:τ1):τ2.e'')^j for some f, α,
x, τ1, τ2, e'', and j, and e' = ⟨ρ, fix f[α](x:τ1):τ2.e''⟩^j.
The first hypothesis is that ⊢ (fix f[α](x:τ1):τ2.e'')^j tr.
There is only one rule to derive this judgement and that
rule requires that ⊢ fix f[α](x:τ1):τ2.e'' tr, which in turn
can only be derived by one rule that requires that ⊢ e'' tr.
Then, since ⊢ ρ tr, by the rules for traceability,
⊢v ⟨ρ, fix f[α](x:τ1):τ2.e''⟩ : r, and by the traceability rules again
⊢ ⟨ρ, fix f[α](x:τ1):τ2.e''⟩^j tr, as we are required to prove.

• In this case, e = (box_τ v_t^i)^j for some τ, v_t, i, and j,
e' = ⟨v_t^i:τ⟩^j, and tr(τ) = t. The first hypothesis is that
⊢ (box_τ v_t^i)^j tr. There is only one rule to derive this judgement
and that rule requires that ⊢ box_τ v_t^i tr, which in turn
can only be derived by one rule that requires that ⊢ v_t^i tr.
There is only one rule to derive the latter judgement and it requires
that ⊢v v_t : t' for some t'. By inspection of the rules for
value traceability, we see that t = t'. Since tr(τ) = t = t', by
the rules for traceability, ⊢v ⟨v_t^i:τ⟩ : r, and by the traceability
rules again ⊢ ⟨v_t^i:τ⟩^j tr, as we are required to prove.



• In this case, e = (e1[τ] e2)^i for some e1, τ, e2, and i, e' =
(e1'[τ] e2)^i for some e1', and (ρ, e1) ↦ (ρ, e1') is a subderivation.
The first hypothesis is that ⊢ (e1[τ] e2)^i tr. There is
only one rule to derive this judgement and that rule requires
that ⊢ e1[τ] e2 tr, which in turn can only be derived by one
rule that requires both ⊢ e1 tr and ⊢ e2 tr. Thus, by the
induction hypothesis, ⊢ e1' tr. Then, by the rules for traceability,
⊢ e1'[τ] e2 tr, and by the traceability rules again,
⊢ (e1'[τ] e2)^i tr, as we are required to prove.

• In this case, e = (v^i[τ] e2)^j for some v, i, τ, e2, and j,
e' = (v^i[τ] e2')^j for some e2', and (ρ, e2) ↦ (ρ, e2') is a
subderivation. The first hypothesis is that ⊢ (v^i[τ] e2)^j tr.
There is only one rule to derive this judgement and that rule
requires that ⊢ v^i[τ] e2 tr, which in turn can only be derived
by one rule that requires both ⊢ v^i tr and ⊢ e2 tr. Thus,
by the induction hypothesis, ⊢ e2' tr. Then, by the rules for
traceability, ⊢ v^i[τ] e2' tr, and by the traceability rules again,
⊢ (v^i[τ] e2')^j tr, as we are required to prove.
• In this case:

    e = (v_f^j [τ] v_t^k)^l
    v_f = ⟨ρ', fix f[α](x:τ1):τ2.e''⟩
    e' = ρ''(e''[τ/α])^l
    ρ'' = ρ', f:τ' = v_f^j, x:τ1' = v_t^k
    τ' = (∀α.τ1 → τ2)
    τ1' = τ1[τ/α]
    tr(τ1') = t    (6)

for some ρ', f, α, x, τ1, τ2, e'', j, τ, v_t, k, and l. The first
hypothesis is that ⊢ (v_f^j [τ] v_t^k)^l tr. There is only one rule to
derive that judgement and that rule requires that ⊢ v_f^j [τ] v_t^k tr,
which in turn can only be derived by one rule that requires both
⊢ v_f^j tr and ⊢ v_t^k tr. Both of these latter derivations can only
be derived by one rule and those rules require that ⊢v v_f : r (1)
and ⊢v v_t : t (2) (a simple inspection reveals the traceabilities to
be r and t). Judgement (1) can only be derived by one rule and
that rule requires that ⊢ ρ' tr (3) and ⊢ e'' tr (4). By (3), (1),
tr(τ') = r, (2), and (6) we can derive ⊢ ρ'' tr (5). By (5) and
(4) we can derive ⊢ ρ''(e'') tr, and then ⊢ e' tr, as required.

• In this case, e = (box_τ e'')^i for some τ, e'', and i, e' =
(box_τ e''')^i, and (ρ, e'') ↦ (ρ, e''') is a subderivation. The
first hypothesis is that ⊢ (box_τ e'')^i tr. There is only one rule to
derive this judgement and that rule requires that ⊢ box_τ e'' tr,
which in turn can only be derived by one rule that requires that
⊢ e'' tr. Thus, by the induction hypothesis, ⊢ e''' tr. Then, by
the rules for traceability, ⊢ box_τ e''' tr, and by the traceability
rules again, ⊢ (box_τ e''')^i tr, as we are required to prove.

• In this case, e = (unbox e'')^i for some e'' and i, e' =
(unbox e''')^i, and (ρ, e'') ↦ (ρ, e''') is a subderivation.
The first hypothesis is that ⊢ (unbox e'')^i tr. There is only
one rule to derive this judgement and that rule requires that
⊢ unbox e'' tr, which in turn can only be derived by one rule
that requires that ⊢ e'' tr. Thus, by the induction hypothesis,
⊢ e''' tr. Then, by the rules for traceability, ⊢ unbox e''' tr,
and by the traceability rules again, ⊢ (unbox e''')^i tr, as we
are required to prove.

• In this case, e = (unbox ⟨v^i:τ⟩^j)^k for some τ, v, i, j, and k,
and e' = v^i. The first hypothesis is that ⊢ (unbox ⟨v^i:τ⟩^j)^k tr.
There is only one rule to derive this judgement and that rule
requires that ⊢ unbox ⟨v^i:τ⟩^j tr, which in turn can only be
derived by one rule that requires that ⊢ ⟨v^i:τ⟩^j tr. There is only
one rule to derive this latter judgement and that rule requires
that ⊢v ⟨v^i:τ⟩ : t for some t, which in turn can only be derived
by one rule that requires that ⊢v v : tr(τ). Then, by the rules for
traceability, ⊢ v^i tr, as we are required to prove.

⊢ Γ wf

⊢ x1:τ1 = v1^i1, ..., xn:τn = vn^in : x1:τ1, ..., xn:τn

⊢ M : τ

    ⊢ ρ : Γ    ∅; Γ ⊢ e : τ
    ------------------------
    ⊢ (ρ, e) : τ

Figure 4. Type rules, other constructs


• In this case, e = ρ'(e'')^i for some ρ', e'' and i, e' = ρ'(e''')^i
for some e''', and (ρ', e'') ↦ (ρ', e'''). The hypothesis ⊢ e tr
can only be derived in one way, and unpacking that we see
that ⊢ ρ' tr and ⊢ e'' tr. Then by the induction hypothesis,
⊢ e''' tr. So applying the rules, we derive that ⊢ ρ'(e''') tr
and then ⊢ e' tr, as required.

• In this case, e = ρ'(v^i)^j for some ρ', v, i, and j, and e' = v^i.
The hypothesis ⊢ e tr can only be derived in one way, and
unpacking that we see that ⊢ v^i tr, which is what we are
required to prove.



There is no corresponding progress property for our notion of 
traceability, since in the absence of further guarantees, programs 
can go wrong. However, typable programs are both traceable and 
do not go wrong, as we will see in the next section, and so preserv- 
ing typability ensures GC correctness. 

2.5 Typing 

    x:τ ∈ Γ
    ------------------
    Δ; Γ ⊢ x^i : τ

    Δ ⊢ (∀α.τ1 → τ2)^i wf
    Δ, α; Γ, f:(∀α.τ1 → τ2)^i, x:τ1 ⊢ e : τ2
    ------------------------------------------------
    Δ; Γ ⊢ (fix f[α](x:τ1):τ2.e)^i : (∀α.τ1 → τ2)^i

    Δ; Γ ⊢ e1 : (∀α.τ1 → τ')^j    Δ; Γ ⊢ e2 : τ2
    Δ ⊢ τ wf    tr(τ) = r    ⊢ τ1[τ/α] = τ2
    ------------------------------------------------
    Δ; Γ ⊢ (e1[τ] e2)^i : τ'[τ/α]

    Δ ⊢ τ wf    Δ; Γ ⊢ e : τ'    ⊢ τ = τ'
    ------------------------------------------------
    Δ; Γ ⊢ (box_τ e)^i : box(τ)^i

    Δ; Γ ⊢ e : box(τ)^j
    ------------------------
    Δ; Γ ⊢ (unbox e)^i : τ

    ⊢ ρ : Γ'    ∅; Γ' ⊢ e : τ
    ------------------------
    Δ; Γ ⊢ ρ(e)^i : τ

    Δ; Γ ⊢ c^i : B^i

    ⊢ ρ : Γ'    ⊢ (∀α.τ1 → τ2)^i wf
    α; Γ', f:(∀α.τ1 → τ2)^i, x:τ1 ⊢ e : τ2
    ------------------------------------------------
    Δ; Γ ⊢ ⟨ρ, fix f[α](x:τ1):τ2.e⟩^i : (∀α.τ1 → τ2)^i

    ⊢ τ wf    Δ; Γ ⊢ v^j : τ'    ⊢ τ = τ'
    ------------------------------------------------
    Δ; Γ ⊢ ⟨v^j:τ⟩^i : box(τ)^i

Figure 5. Type rules, expressions

The typing rules appear in Figures 4 and 5. They are for the most
part standard except for three modifications. First, as types are
labelled, we must sometimes ignore the labels in typing. Judgement
⊢ τ1 = τ2 states that types τ1 and τ2 are syntactically equivalent
except that the labels on their sub-terms might differ. This is
important in (for example) the rule for application, where we require
only that the parameter type τ1 and the actual argument type
τ2 satisfy ⊢ τ1 = τ2 rather than τ1 = τ2; similarly in the rule
for environments. Second, in the rules for boxes we require that
the traceability of the box equal the traceability of the type of the
thing being boxed. This is essential for showing that well-typedness
implies traceability. Finally, the instantiation rule for polymorphic
functions enforces the property that the type argument have
traceability r.

One particularly important aspect of our language is that we 
assume a type erasure semantics. For this interpretation to be cor- 
rect, we must show that we can compute the correct GC metadata 
when erasing types. The operational semantics have the applica- 
tion of a polymorphic function step to a frame where the annota- 
tion on the function's parameter is a substituted type. We need that 
the GC metadata for this substituted type equal the GC metadata 
for the unsubstituted parameter type of the function. The requirement
$\mathit{tr}(\tau) = r$ in the typing rule for application is crucial
to that equality, and the following lemma proves it.

Lemma 2

If $\mathit{tr}(\tau) = r$ then
$\mathit{tr}(\tau') = \mathit{tr}(\tau'[\tau/\alpha])$.



Proof: The proof is by inspection of the definitions. ■ 
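
The content of Lemma 2 can be checked on small examples. The sketch below is illustrative only: the class names, the `tr` and `subst` functions, and the labels are hypothetical, and it assumes that type variables have traceability r (the reason the application rule demands tr(τ) = r) and that a substituted occurrence keeps the label of the variable it replaces, as the later use of lbl(τ1[τ/α]) = lbl(τ1) suggests. Variable capture is ignored.

```python
from dataclasses import dataclass, replace
from typing import Union


@dataclass(frozen=True)
class TVar:          # alpha^label
    name: str
    label: str


@dataclass(frozen=True)
class Base:          # B^label
    label: str


@dataclass(frozen=True)
class Fun:           # (forall tvar. param -> result)^label
    tvar: str
    param: "Type"
    result: "Type"
    label: str


@dataclass(frozen=True)
class Box:           # box(body)^label
    body: "Type"
    label: str


Type = Union[TVar, Base, Fun, Box]


def tr(t: Type) -> str:
    """Traceability of a type: 'b' (raw bits) or 'r' (GC reference)."""
    # Assumption: type variables count as 'r', because only types with
    # traceability r may instantiate them.
    return "b" if isinstance(t, Base) else "r"


def subst(t: Type, a: str, s: Type) -> Type:
    """t[s/a]; the substituted occurrence keeps its own outer label."""
    if isinstance(t, TVar):
        return replace(s, label=t.label) if t.name == a else t
    if isinstance(t, Base):
        return t
    if isinstance(t, Fun):
        if t.tvar == a:          # a is shadowed by the inner binder
            return t
        return Fun(t.tvar, subst(t.param, a, s), subst(t.result, a, s), t.label)
    return Box(subst(t.body, a, s), t.label)


tau = Box(Base("i1"), "i2")      # a type argument with traceability r
tau_prime = Fun("b", TVar("a", "i3"), Box(TVar("a", "i4"), "i5"), "i6")
```

With these definitions, `tr(subst(tau_prime, "a", tau))` equals `tr(tau_prime)`, and the same holds for the parameter type `tau_prime.param`. Substituting a base type for the variable would change the traceability of `TVar("a", "i3")` from r to b, which is the situation the premise rules out.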

We can prove type safety for this language in the standard 
way, via progress and preservation lemmas. First we need several
lemmas: that type equality is an equivalence relation, that equal 
types have the same traceabilities, that a well-typed value has 
the same traceability as its type, that type equality respects type 
substitution, that value typing is independent of the typing context, 
and a type substitution lemma. 

Lemma 3 

Type equality is an equivalence relation, that is,
$\vdash \tau = \tau$, $\vdash \tau_1 = \tau_2$ implies
$\vdash \tau_2 = \tau_1$, and $\vdash \tau_1 = \tau_2$ and
$\vdash \tau_2 = \tau_3$ implies $\vdash \tau_1 = \tau_3$.

Proof: The proof is by a simple induction on the structure of $\tau$
for reflexivity or the structure of the derivation(s) for symmetry and
transitivity and inspection of the rules. ■

Lemma 4 

If $\vdash \tau_1 = \tau_2$ then $\mathit{tr}(\tau_1) = \mathit{tr}(\tau_2)$.

Proof: The proof is by inspection of the last rule used. ■ 
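
As an illustration of label-insensitive type equality and of Lemma 4, the following sketch encodes types as nested tuples. The encoding and the names `erase`, `type_eq`, and `tr` are assumptions made for this example, not notation from the paper; equality is checked purely syntactically, with no alpha-renaming, and type variables are assumed to have traceability r.

```python
def erase(t):
    """Drop every label from a type encoded as a nested tuple."""
    kind = t[0]
    if kind == "base":                 # ("base", label)
        return ("base",)
    if kind == "var":                  # ("var", name, label)
        return ("var", t[1])
    if kind == "fun":                  # ("fun", tvar, param, result, label)
        return ("fun", t[1], erase(t[2]), erase(t[3]))
    if kind == "box":                  # ("box", body, label)
        return ("box", erase(t[1]))
    raise ValueError(f"unknown type form: {kind!r}")


def type_eq(t1, t2):
    """|- t1 = t2: the two types agree once all labels are ignored."""
    return erase(t1) == erase(t2)


def tr(t):
    """Traceability: 'b' for base types, 'r' otherwise (assumed for variables)."""
    return "b" if t[0] == "base" else "r"


# A parameter type and an argument type that differ only in their labels.
param_ty = ("box", ("base", "i1"), "i2")
arg_ty = ("box", ("base", "j7"), "j8")
```

Here `param_ty != arg_ty` as labelled types, yet `type_eq(param_ty, arg_ty)` holds and both have traceability r, which is the situation the application rule is designed to accept.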

Lemma 5 

If $\vdash \tau_1 = \tau_2$ then
$\vdash \tau_1[\tau/\alpha] = \tau_2[\tau/\alpha]$.

Proof: The proof is by an easy induction on the derivation of
$\vdash \tau_1 = \tau_2$. ■

Lemma 6 

If $\Delta;\Gamma \vdash v^i : \tau$, where $v$ has traceability $t$, then $\mathit{tr}(\tau) = t$.

Proof: The proof is by inspection of the three rules for value 
typing. ■ 

Lemma 7 

If $\Delta;\Gamma \vdash v^i : \tau$ then
$\Delta';\Gamma' \vdash v^i : \tau$ for any $\Delta'$ and $\Gamma'$.

Proof: The proof is by an easy induction on the typing derivation
and inspection of the three rules for value typing. ■

Lemma 8 

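If $\Delta,\alpha,\Delta';\Gamma \vdash e : \tau$,
$\Delta \vdash \tau'\ \mathit{wf}$, and $\mathit{tr}(\tau') = r$ then
$\Delta,\Delta';\Gamma[\tau'/\alpha] \vdash e[\tau'/\alpha] : \tau[\tau'/\alpha]$.

Proof: The proof is a straightforward induction over the derivation
of $\Delta,\alpha,\Delta';\Gamma \vdash e : \tau$. It uses Lemma 2 in
the case of the rule for application. ■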

With all these lemmas we can prove Type Preservation and 
Progress. 

Lemma 9 (Type Preservation) 

If $\vdash M_1 : \tau_1$ and $M_1 \mapsto M_2$ then
$\vdash M_2 : \tau_2$ and $\vdash \tau_1 = \tau_2$ for some $\tau_2$.

Proof: Assume that $\vdash (\rho, e_1) : \tau_1$ and
$(\rho, e_1) \mapsto (\rho, e_2)$. We will show by induction on the
derivation of the latter that $\vdash (\rho, e_2) : \tau_2$ and
$\vdash \tau_1 = \tau_2$ for some $\tau_2$. By the typing rules,
$\vdash \rho : \Gamma$ and $\emptyset;\Gamma \vdash e_1 : \tau_1$ for
some $\Gamma$. By the typing rules, we just need to show that
$\emptyset;\Gamma \vdash e_2 : \tau_2$ and $\vdash \tau_1 = \tau_2$
for some $\tau_2$. Consider the cases, in the same order as the
figure, for the last rule used to derive the reduction:



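• (Variable) In this case, $e_1 = x^i$, $e_2 = v^j$, and
$x{:}\tau' = v^j \in \rho$ for some $x$, $i$, $v$, $j$, and $\tau'$.
The typing judgement can only be derived with one rule and it
requires that $x{:}\tau_1 \in \Gamma$. The typing judgement (for
$\rho$) can only be derived in one way and it requires that
$\tau_1 = \tau'$, $\emptyset;\emptyset \vdash v^j : \tau''$, and
$\vdash \tau_1 = \tau''$. Thus the desired $\tau_2$ is $\tau''$. We
just need to show that $\emptyset;\Gamma \vdash v^j : \tau_2$, which
follows by Lemma 7.

• (Fix expression) In this case,
$e_1 = (\mathsf{fix}\ f[\alpha](x{:}\tau_1'){:}\tau_2'.e')^i$ and
$e_2 = (\rho, \mathsf{fix}\ f[\alpha](x{:}\tau_1'){:}\tau_2'.e')^i$.
The typing judgement can only be derived with one rule and it
requires that $\vdash \tau_1\ \mathit{wf}$,
$\alpha;\Gamma, f{:}\tau_1, x{:}\tau_1' \vdash e' : \tau_2'$, and
$\tau_1 = (\forall\alpha.\tau_1' \to \tau_2')^i$. Thus by the typing
rules, $\emptyset;\Gamma \vdash e_2 : \tau_1$. By Lemma 3,
$\vdash \tau_1 = \tau_1$, so the result follows by setting
$\tau_2 = \tau_1$.

• (Box expression) In this case, $e_1 = (\mathsf{box}_\tau\ v^i)^j$
and $e_2 = (v^i{:}\tau)^j$ for some $\tau$, $v$, $i$, and $j$. The
typing judgement can only be derived with one rule and it requires
that $\vdash \tau\ \mathit{wf}$,
$\emptyset;\Gamma \vdash v^i : \tau'$, $\vdash \tau = \tau'$, and
$\tau_1 = \mathsf{box}(\tau)^j$. By the typing rules,
$\emptyset;\Gamma \vdash e_2 : \tau_1$. By Lemma 3,
$\vdash \tau_1 = \tau_1$, so the result follows by setting
$\tau_2 = \tau_1$.

• (Application function) In this case, $e_1 = (e_3[\tau]\ e_4)^i$,
$e_2 = (e_5[\tau]\ e_4)^i$, and $(\rho, e_3) \mapsto (\rho, e_5)$ for
some $e_3$, $\tau$, $e_4$, and $e_5$. The typing judgement can only be
derived with one rule and it requires that
$\emptyset;\Gamma \vdash e_3 : (\forall\alpha.\tau_3 \to \tau')^j$,
$\emptyset;\Gamma \vdash e_4 : \tau_4$, $\vdash \tau\ \mathit{wf}$,
$\mathit{tr}(\tau) = r$, $\vdash \tau_3[\tau/\alpha] = \tau_4$, and
$\tau_1 = \tau'[\tau/\alpha]$ for some $\tau_3$, $\tau'$, $j$, and
$\tau_4$. By the induction hypothesis,
$\emptyset;\Gamma \vdash e_5 : \tau_5$ and
$\vdash (\forall\alpha.\tau_3 \to \tau')^j = \tau_5$ for some
$\tau_5$. There is only one rule to derive the latter and it requires
that $\tau_5 = (\forall\alpha.\tau_{51} \to \tau_{52})^k$,
$\vdash \tau_3 = \tau_{51}$, and $\vdash \tau' = \tau_{52}$ for some
$\tau_{51}$, $\tau_{52}$, and $k$. By Lemma 5,
$\vdash \tau_3[\tau/\alpha] = \tau_{51}[\tau/\alpha]$ and
$\vdash \tau'[\tau/\alpha] = \tau_{52}[\tau/\alpha]$. By Lemma 3,
$\vdash \tau_{51}[\tau/\alpha] = \tau_4$. So by the typing rules,
$\emptyset;\Gamma \vdash e_2 : \tau_{52}[\tau/\alpha]$. The result
follows by setting $\tau_2 = \tau_{52}[\tau/\alpha]$.

• (Application argument) In this case, $e_1 = (e_3[\tau]\ e_4)^i$,
$e_2 = (e_3[\tau]\ e_5)^i$, and $(\rho, e_4) \mapsto (\rho, e_5)$ for
some $e_3$, $e_4$, and $e_5$. The typing judgement can only be derived
with one rule and it requires that
$\emptyset;\Gamma \vdash e_3 : (\forall\alpha.\tau_3 \to \tau')^j$,
$\emptyset;\Gamma \vdash e_4 : \tau_4$, $\vdash \tau\ \mathit{wf}$,
$\mathit{tr}(\tau) = r$, $\vdash \tau_3[\tau/\alpha] = \tau_4$, and
$\tau_1 = \tau'[\tau/\alpha]$ for some $\tau_3$, $\tau'$, $j$, and
$\tau_4$. By the induction hypothesis,
$\emptyset;\Gamma \vdash e_5 : \tau_5$ and $\vdash \tau_4 = \tau_5$.
By Lemma 3, $\vdash \tau_3[\tau/\alpha] = \tau_5$. So by the typing
rules, $\emptyset;\Gamma \vdash e_2 : \tau_1$. By Lemma 3,
$\vdash \tau_1 = \tau_1$, so the result follows by setting
$\tau_2 = \tau_1$.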

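• (Application beta) In this case:

$e_1 = (v_1^i[\tau]\ v_2^j)^k$

$v_1 = (\rho', \mathsf{fix}\ f[\alpha](x{:}\tau_1'){:}\tau_2'.e')$

$e_2 = \rho''(e'[\tau/\alpha])^k$

$\rho'' = \rho', f{:}\tau' = v_1^i, x{:}\tau_1'[\tau/\alpha] = v_2^j$

$\tau' = (\forall\alpha.\tau_1' \to \tau_2')^i$

for some $\rho'$, $f$, $\alpha$, $x$, $\tau_1'$, $\tau_2'$, $e'$, $i$,
$v_2$, $j$, and $k$. Unpacking the typing judgement, which can only be
derived in one way, $\emptyset;\Gamma \vdash v_1^i : \tau'$ (1),
$\vdash \rho' : \Gamma'$ (2), $\vdash \tau'\ \mathit{wf}$ (12),
$\alpha;\Gamma', f{:}\tau', x{:}\tau_1' \vdash e' : \tau_2'$ (3),
$\tau_1 = \tau_2'[\tau/\alpha]$ (4),
$\emptyset;\Gamma \vdash v_2^j : \tau_2''$ (5),
$\vdash \tau_1'[\tau/\alpha] = \tau_2''$ (6),
$\vdash \tau\ \mathit{wf}$, and $\mathit{tr}(\tau) = r$ for some
$\Gamma'$ and $\tau_2''$. By (1) and Lemma 7,
$\emptyset;\emptyset \vdash v_1^i : \tau'$ (7). By Lemma 3,
$\vdash \tau' = \tau'$ (8). By (5) and Lemma 7,
$\emptyset;\emptyset \vdash v_2^j : \tau_2''$ (9). By (2), (7), (8),
(9), and (6), the typing rules give
$\vdash \rho'' : \Gamma', f{:}\tau', x{:}\tau_1'[\tau/\alpha]$ (10).
By (3) and Lemma 8,
$\emptyset;(\Gamma', f{:}\tau', x{:}\tau_1')[\tau/\alpha] \vdash e'[\tau/\alpha] : \tau_2'[\tau/\alpha]$
(11). By (2) and (12), by inspection of the typing rules,
$\Gamma'[\tau/\alpha] = \Gamma'$ and $\tau'[\tau/\alpha] = \tau'$.
Thus,
$\emptyset;\Gamma', f{:}\tau', x{:}\tau_1'[\tau/\alpha] \vdash e'[\tau/\alpha] : \tau_2'[\tau/\alpha]$
(13). By (10) and (13), the typing rules give
$\emptyset;\Gamma \vdash \rho''(e'[\tau/\alpha])^k : \tau_2'[\tau/\alpha]$
(14). By (4), the result follows by setting
$\tau_2 = \tau_2'[\tau/\alpha]$.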

• (Box argument) In this case, $e_1 = (\mathsf{box}_\tau\ e)^i$,
$e_2 = (\mathsf{box}_\tau\ e')^i$, and $(\rho, e) \mapsto (\rho, e')$
for some $\tau$, $e$, $i$, and $e'$. The typing judgement can only be
derived with one rule and it requires that $\vdash \tau\ \mathit{wf}$,
$\emptyset;\Gamma \vdash e : \tau'$, $\vdash \tau = \tau'$, and
$\tau_1 = \mathsf{box}(\tau)^i$ for some $\tau'$. By the induction
hypothesis, $\emptyset;\Gamma \vdash e' : \tau''$ and
$\vdash \tau' = \tau''$ for some $\tau''$. By Lemma 3,
$\vdash \tau = \tau''$. By the typing rules,
$\emptyset;\Gamma \vdash e_2 : \mathsf{box}(\tau)^i$. By Lemma 3,
$\vdash \tau_1 = \tau_1$, so the result follows by setting
$\tau_2 = \tau_1$.

• (Unbox argument) In this case, $e_1 = (\mathsf{unbox}\ e)^i$,
$e_2 = (\mathsf{unbox}\ e')^i$, and $(\rho, e) \mapsto (\rho, e')$ for
some $e$, $i$, and $e'$. The typing judgement can only be derived with
one rule and it requires that
$\emptyset;\Gamma \vdash e : \mathsf{box}(\tau_1)^j$ for some $j$. By
the induction hypothesis, $\emptyset;\Gamma \vdash e' : \tau'$ and
$\vdash \mathsf{box}(\tau_1)^j = \tau'$ for some $\tau'$. The latter
can only be derived with one rule and it requires that
$\tau' = \mathsf{box}(\tau'')^k$ and $\vdash \tau_1 = \tau''$ for some
$\tau''$ and $k$. By the typing rules,
$\emptyset;\Gamma \vdash e_2 : \tau''$, so the result follows by
setting $\tau_2 = \tau''$.

• (Unbox beta) In this case,
$e_1 = (\mathsf{unbox}\ (v^i{:}\tau)^j)^k$ and $e_2 = v^i$ for some
$\tau$, $v$, $i$, $j$, and $k$. The typing judgement can only be
derived with one rule and it requires that
$\emptyset;\Gamma \vdash (v^i{:}\tau)^j : \mathsf{box}(\tau_1)^l$ for
some $l$. The latter can only be derived with one rule and it requires
$\tau = \tau_1$, $\emptyset;\Gamma \vdash v^i : \tau'$, and
$\vdash \tau = \tau'$. So the result follows by setting
$\tau_2 = \tau'$.

• (Frame step) In this case, $e_1 = \rho'(e)^i$, $e_2 = \rho'(e')^i$,
and $(\rho', e) \mapsto (\rho', e')$ for some $\rho'$, $e$, $i$, and
$e'$. The typing judgement can only be derived with one rule and it
requires that $\vdash \rho' : \Gamma'$ and
$\emptyset;\Gamma' \vdash e : \tau_1$ for some $\Gamma'$. By the
induction hypothesis, $\emptyset;\Gamma' \vdash e' : \tau'$ and
$\vdash \tau_1 = \tau'$ for some $\tau'$. By the typing rules,
$\emptyset;\Gamma \vdash e_2 : \tau'$, so the result follows by
setting $\tau_2 = \tau'$.

• (Frame return) In this case, $e_1 = \rho'(v^i)^j$ and $e_2 = v^i$
for some $\rho'$, $v$, $i$, and $j$. The typing judgement can only be
derived with one rule and it requires that $\vdash \rho' : \Gamma'$
and $\emptyset;\Gamma' \vdash v^i : \tau_1$. By Lemma 7,
$\emptyset;\Gamma \vdash v^i : \tau_1$. By Lemma 3,
$\vdash \tau_1 = \tau_1$, so the result follows by setting
$\tau_2 = \tau_1$.



Lemma 10 (Progress) 

If $\vdash M : \tau$ then either $M$ has the form $(\rho, v^i)$ or
$M \mapsto M'$ for some $M'$.

Proof: The result follows from: If $\vdash \rho : \Gamma$ and
$\emptyset;\Gamma \vdash e : \tau$ then either $e$ has the form $v^i$
or $(\rho, e) \mapsto (\rho, e')$ for some $e'$. We will prove this by
induction on the typing derivation for $e$. Consider the last rule, in
the same order as the figure, used to derive the judgement:

• (Variable) In this case $e = x^i$ and $x{:}\tau \in \Gamma$. There
is only one rule to derive $\vdash \rho : \Gamma$ and it requires that
$x{:}\tau = v^j \in \rho$ and other conditions for some $v$ and $j$.
Then by the variable rule, $(\rho, e) \mapsto (\rho, v^j)$, as
required.

• (Fix expression) In this case
$e = (\mathsf{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e')^i$. Clearly by
the fix expression rule:

$(\rho, e) \mapsto (\rho, (\rho, \mathsf{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e')^i)$

• (Application) In this case, $e = (e_1[\tau']\ e_2)^i$. The typing
rule requires that
$\emptyset;\Gamma \vdash e_1 : (\forall\alpha.\tau_1 \to \tau_3)^j$
(1), $\emptyset;\Gamma \vdash e_2 : \tau_2$ (2), and
$\vdash \tau_1[\tau'/\alpha] = \tau_2$ (3) for some $\tau_1$,
$\tau_3$, $j$, and $\tau_2$. By the induction hypothesis, either $e_1$
is a value or reduces, and $e_2$ is a value or reduces. There are
three subcases:

■ Case 1, $e_1$ reduces: In this case there is $e_1'$ such that
$(\rho, e_1) \mapsto (\rho, e_1')$. Then by the application function
rule, $(\rho, e) \mapsto (\rho, (e_1'[\tau']\ e_2)^i)$, as required.

■ Case 2, $e_1$ is a value and $e_2$ reduces: In this case there is
$e_2'$ such that $(\rho, e_2) \mapsto (\rho, e_2')$. Then by the
application argument rule,
$(\rho, e) \mapsto (\rho, (e_1[\tau']\ e_2')^i)$, as required.

■ Case 3, $e_1 = v_1^k$ and $e_2 = v_2^l$ for some $v_1$, $k$, $v_2$,
and $l$: There is only one typing rule to derive (1) and it requires
that $v_1$ have the form
$(\rho', \mathsf{fix}\ f[\alpha](x{:}\tau_1){:}\tau_3.e')$ for some
$\rho'$, $f$, $x$, and $e'$. Let $t$ be the traceability of $v_2$. By
Lemma 6 and (2), $\mathit{tr}(\tau_2) = t$. By Lemma 4 and (3),
$\mathit{tr}(\tau_1[\tau'/\alpha]) = t$. Then by the application beta
rule:

$(\rho, e) \mapsto (\rho, (\rho', f{:}\tau'' = v_1^k, x{:}\tau_1[\tau'/\alpha] = v_2^l)(e'[\tau'/\alpha])^i)$

where $\tau'' = (\forall\alpha.\tau_1 \to \tau_3)^k$, as required.

• (Box expression) In this case, $e = (\mathsf{box}_{\tau'}\ e')^j$
for some $\tau'$, $e'$, and $j$. The typing rule requires that
$\tau = \mathsf{box}(\tau')^j$,
$\emptyset;\Gamma \vdash e' : \tau''$ (1), and
$\vdash \tau' = \tau''$ (2) for some $\tau''$. By the induction
hypothesis, either $e'$ is a value or reduces:

■ If $e' = v^i$, where $v$ has traceability $t$, then by Lemma 6 and
(1), $\mathit{tr}(\tau'') = t$. By (2) and Lemma 4,
$\mathit{tr}(\tau') = t$. So by the box reduction rule,
$(\rho, e) \mapsto (\rho, (v^i{:}\tau')^j)$, as required.

■ If $(\rho, e') \mapsto (\rho, e'')$ then
$(\rho, e) \mapsto (\rho, (\mathsf{box}_{\tau'}\ e'')^j)$, as
required.

• (Unbox) In this case, $e = (\mathsf{unbox}\ e')^i$ for some $e'$ and
$i$. The typing rule requires that
$\emptyset;\Gamma \vdash e' : \mathsf{box}(\tau)^j$ (1) for some $j$.
By the induction hypothesis, $e'$ is a value or reduces:

■ If $e' = v^k$ then (1) can be derived by only one rule and it
requires that $v = (v'^l{:}\tau')$ for some $v'$, $l$, and $\tau'$. By
the unbox beta rule, $(\rho, e) \mapsto (\rho, v'^l)$, as required.

■ If $(\rho, e') \mapsto (\rho, e'')$ then by the unbox argument rule,
$(\rho, e) \mapsto (\rho, (\mathsf{unbox}\ e'')^i)$, as required.

• (Frame) In this case, $e = \rho'(e')^i$ for some $\rho'$, $e'$, and
$i$. The typing rule requires that $\vdash \rho' : \Gamma'$ and
$\emptyset;\Gamma' \vdash e' : \tau$ for some $\Gamma'$. By the
induction hypothesis, $e'$ is a value or reduces:

■ If $e' = v^j$ then by the frame return rule,
$(\rho, e) \mapsto (\rho, v^j)$, as required.

■ If $(\rho', e') \mapsto (\rho', e'')$ then by the frame step rule,
$(\rho, e) \mapsto (\rho, \rho'(e'')^i)$, as required.



• (Constant) In this case $e = c^i$ for some $c$ and $i$ and is
clearly a value.

• (Fix value) In this case
$e = (\rho', \mathsf{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e')^i$ for
some $\rho'$, $f$, $\alpha$, $x$, $\tau_1$, $\tau_2$, $e'$, and $i$
and is clearly a value.

• (Box value) In this case $e = (v^i{:}\tau')^j$ for some $\tau'$,
$v$, $i$, and $j$ and is clearly a value.



We can also prove that typability implies traceability and thus 
typable programs are GC safe and remain so throughout execution. 

Lemma 11 

• If $\vdash M : \tau$ then $\vdash M\ \mathsf{tr}$.

• If $\vdash \rho : \Gamma$ then $\vdash \rho\ \mathsf{tr}$.

• If $\Delta;\Gamma \vdash e : \tau$ then $\vdash e\ \mathsf{tr}$.

• If $\Delta;\Gamma \vdash v^i : \tau$ then $\vdash_v v : \mathit{tr}(\tau)$.

Proof: The results are proven simultaneously by induction on the 
structure of the typing derivation. The cases for the last rule used, 
in the same order as the figure, are: 

• (Variable) In this case clearly $\vdash e\ \mathsf{tr}$.

• (Fix expression) In this case
$e = (\mathsf{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e')^i$ for some
$f$, $\alpha$, $x$, $\tau_1$, $\tau_2$, $e'$, and $i$. Then by the
typing rule,
$\Delta,\alpha;\Gamma, f{:}\tau, x{:}\tau_1 \vdash e' : \tau_2$ is a
subderivation. By the induction hypothesis, $\vdash e'\ \mathsf{tr}$.
So by the rules for traceability, $\vdash e\ \mathsf{tr}$, as
required.

• (Application) In this case $e = (e_1[\tau']\ e_2)^i$ for some
$e_1$, $\tau'$, $e_2$, and $i$. By the typing rule,
$\Delta;\Gamma \vdash e_1 : \tau_1$ and
$\Delta;\Gamma \vdash e_2 : \tau_2$ for some $\tau_1$ and $\tau_2$. By
the induction hypothesis, $\vdash e_1\ \mathsf{tr}$ and
$\vdash e_2\ \mathsf{tr}$. By the rules for traceability,
$\vdash e\ \mathsf{tr}$, as required.

• (Box expression) In this case $e = (\mathsf{box}_{\tau'}\ e')^i$ for
some $\tau'$, $e'$, and $i$. By the typing rule,
$\Delta;\Gamma \vdash e' : \tau''$ for some $\tau''$. By the induction
hypothesis, $\vdash e'\ \mathsf{tr}$. By the rules for traceability,
$\vdash e\ \mathsf{tr}$, as required.

• (Unbox) In this case $e = (\mathsf{unbox}\ e')^i$ for some $e'$ and
$i$. By the typing rule, $\Delta;\Gamma \vdash e' : \tau'$ for some
$\tau'$. By the induction hypothesis, $\vdash e'\ \mathsf{tr}$. By the
rules for traceability, $\vdash e\ \mathsf{tr}$, as required.

• (Frame) In this case, $e = \rho(e')^i$ for some $\rho$, $e'$, and
$i$. By the typing rule, $\vdash \rho : \Gamma'$ and
$\emptyset;\Gamma' \vdash e' : \tau$ for some $\Gamma'$. By the
induction hypothesis, $\vdash \rho\ \mathsf{tr}$ and
$\vdash e'\ \mathsf{tr}$. By the rules for traceability,
$\vdash e\ \mathsf{tr}$, as required.

• (Constant) In this case $e = c^i$. By the typing rule,
$\tau = B^i$ and so clearly $\mathit{tr}(\tau) = b$. By the rules for
traceability, $\vdash_v c : b$, proving the fourth result. By the
rules for traceability again, $\vdash e\ \mathsf{tr}$, proving the
third result.

• (Fix value) In this case
$e = (\rho, \mathsf{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e')^i$ for
some $\rho$, $f$, $\alpha$, $x$, $\tau_1$, $\tau_2$, $e'$, and $i$. By
the typing rule, $\vdash \rho : \Gamma'$ and
$\alpha;\Gamma', f{:}\tau, x{:}\tau_1 \vdash e' : \tau_2$ for some
$\Gamma'$. Also by the typing rules, $\tau$ is a function type, so
$\mathit{tr}(\tau) = r$. By the induction hypothesis,
$\vdash \rho\ \mathsf{tr}$ and $\vdash e'\ \mathsf{tr}$. By the rules
for traceability,
$\vdash_v (\rho, \mathsf{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e') : r$,
proving the fourth result. By the rules for traceability again,
$\vdash e\ \mathsf{tr}$, proving the third result.

• (Box value) In this case $e = (v^i{:}\tau')^j$. By the typing rule,
$\Delta;\Gamma \vdash v^i : \tau''$ and $\vdash \tau' = \tau''$ for
some $\tau''$. Also by the typing rule, $\tau$ is a box type, so
$\mathit{tr}(\tau) = r$. By the induction hypothesis,
$\vdash_v v : \mathit{tr}(\tau'')$. By Lemma 4,
$\vdash_v v : \mathit{tr}(\tau')$. By the rules for traceability,
$\vdash_v (v^i{:}\tau') : r$, proving the fourth result. By the rules
for traceability again, $\vdash e\ \mathsf{tr}$, as required.

• (Environment) In this case
$\rho = x_1{:}\tau_1 = v_1^{i_1}, \ldots, x_n{:}\tau_n = v_n^{i_n}$.
By the typing rule,
$\emptyset;\emptyset \vdash v_j^{i_j} : \tau_j'$ and
$\vdash \tau_j = \tau_j'$ for $1 \le j \le n$ and some $\tau_j'$s. By
the induction hypothesis, $\vdash_v v_j : \mathit{tr}(\tau_j')$ for
$1 \le j \le n$. By Lemma 4,
$\mathit{tr}(\tau_j) = \mathit{tr}(\tau_j')$ for $1 \le j \le n$. Thus
$\vdash_v v_j : \mathit{tr}(\tau_j)$ for $1 \le j \le n$. By the
traceability rules, $\vdash \rho\ \mathsf{tr}$, as required.

• (Machine state) In this case $M = (\rho, e)$. By the typing rule,
$\vdash \rho : \Gamma$ and $\emptyset;\Gamma \vdash e : \tau$ for some
$\Gamma$. By the induction hypothesis, $\vdash \rho\ \mathsf{tr}$ and
$\vdash e\ \mathsf{tr}$. By the rules for traceability,
$\vdash M\ \mathsf{tr}$, as required.



3. Flow analysis

Our original motivation for this work was to apply interprocedural
analysis to the problem of eliminating unnecessary boxing in programs.
There is a vast body of literature on interprocedural analysis and
optimization, and it is generally fairly straightforward to use these
approaches to obtain information about what terms flow to what use
sites. This paper is not intended to provide any contribution to the
algorithmic side of this body of work, which we will broadly refer to
as flow analysis. Our contribution in this paper lies in showing how
to extend flow analysis to the type level, and showing that any
generic flow analysis so extended can be used to implement an unboxing
optimization that preserves type safety.

In order to do this, we must provide some framework for describing
what information a flow analysis must provide. For the purposes of our
unboxing optimization, we are interested in finding
(interprocedurally) for every $(\mathsf{unbox}\ v^j)$ operation the
set of $(\mathsf{box}_\tau\ e)^k$ terms that could possibly reach $v$.
Under appropriate conditions, we can then eliminate both the box
introductions and the box elimination, thereby improving the program.
The core language defined in Section 2 provides labels serving as
proxies for the terms, types, and variables on which they occur; the
question above can therefore be restated as finding the set of labels
$k$ that reach the position labeled with $j$.

More generally, following previous work we begin by defining an
abstract notion of analysis. We say that an analysis is a pair
$(C, \varrho)$. Binding environments $\varrho$ simply serve to map
variables to the label of their binding sites. The mappings are, as
usual, global for the program. Consequently, a given environment may
not apply to alpha-variants of a term. We do not require that labels
be unique within a program; as usual, however, analyses will be more
precise if this is the case. Variables are also not required to be
unique (since reduction may duplicate terms and hence binding sites).
However, duplicate variable bindings in a program must be labeled
consistently according to $\varrho$ or else no analysis of the program
can be acceptable according to our definition. This can always be
avoided by alpha-varying or relabeling appropriately.
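
The consistency requirement on binding environments can be pictured with a small sketch. The function name and the input format (a list of variable and binding-site label pairs collected from a program) are hypothetical choices for this illustration; the paper does not prescribe how the binding environment is computed.

```python
def build_binding_env(binding_sites):
    """Build the global map from variables to binding-site labels.

    `binding_sites` is an iterable of (variable, label) pairs, one per
    binding occurrence found in the program (a hypothetical input format).
    """
    env = {}
    for var, label in binding_sites:
        # A repeated binding is fine only if it carries the same label.
        if env.setdefault(var, label) != label:
            raise ValueError(
                f"{var!r} is bound at labels {env[var]!r} and {label!r}; "
                "alpha-vary or relabel the program first"
            )
    return env


# A duplicated but consistently labelled binding of x is accepted.
env = build_binding_env([("f", "i1"), ("x", "i2"), ("x", "i2")])
```

A list such as `[("x", "i2"), ("x", "i9")]` is rejected, which corresponds to the case where no analysis of the program can be acceptable until the program is relabeled.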

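A cache $C$ is a mapping from labels to sets of shapes. Shapes are
given by the grammar:

Shapes: $s ::= c^i \mid (\forall i.j \to k)^l \mid (\mathsf{box}_t\ k)^i$ (term shapes)
$\quad \mid B^i \mid (\forall i.j \to k)^l \mid (\mathsf{box}\ k)^i$ (type shapes)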



There are two classes of shapes: term shapes and type shapes. The idea
behind term shapes is that each shape provides a proxy for a set of
terms that might flow to a given location, describing both the shape
of the values that might flow there and the labels of the
sub-components of those values. For example, for an analysis
$(C, \varrho)$, $c^i \in C(k)$ indicates that (according to the
analysis) the constant $c$, labeled with $i$, might flow to a location
labeled with $k$. Similarly, if $(\forall i'.i \to j)^k \in C(l)$,
then the analysis specifies that among the values flowing to locations
labeled with $l$ might be functions labeled with $k$, whose type
parameter is labeled with $i'$, whose parameter type is labeled with
$i$, and whose bodies are labeled with $j$. If
$(\mathsf{box}_t\ k)^i \in C(l)$ then among the values that might flow
to $l$ (according to the analysis) are boxed values labeled with $i$,
with meta-data $t$, and whose bodies are labeled by some $j$ such that
$C(j) \subseteq C(k)$.

Where term shapes provide a proxy for the set of values that might
flow to a given location, type shapes provide a proxy for the types of
the locations that values might flow through to get to a given
location. For example, for an analysis $(C, \varrho)$,
$B^i \in C(k)$ indicates that (according to the analysis) objects that
reach location $k$ might flow through a variable or term of type $B$,
labeled with $i$. The function type and box type shapes similarly
correspond to the flow of values through locations labeled with
function or box types.
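
One possible concrete encoding of a cache, given only as a sketch: shapes are tagged tuples and the cache is a dictionary from labels to sets of shapes. The constructor names, the tags, and the helper `boxes_reaching` are assumptions made for this illustration; the example cache is written by hand rather than produced by any analysis.

```python
from collections import defaultdict


# Term shapes: proxies for the values that may flow to a label.
def const_shape(c, label):                    # c^label
    return ("const", c, label)


def fun_shape(tparam, param, body, label):    # (forall tparam. param -> body)^label
    return ("fun", tparam, param, body, label)


def box_shape(t, body, label):                # (box_t body)^label
    return ("box", t, body, label)


# Type shapes: proxies for the types of the locations a value flows through.
def base_type_shape(label):                   # B^label
    return ("base-type", label)


def box_type_shape(body, label):              # (box body)^label
    return ("box-type", body, label)


def boxes_reaching(cache, label):
    """Labels of the boxed values that may flow to `label`."""
    return {shape[3] for shape in cache[label] if shape[0] == "box"}


# Hand-written cache for a program that boxes the constant 3 (labelled c)
# at label bx and later unboxes a variable occurrence labelled u.
cache = defaultdict(set)
cache["c"] = {const_shape(3, "c"), base_type_shape("t")}
cache["bx"] = {box_shape("b", "c", "bx"), box_type_shape("t", "bt")}
cache["u"] = set(cache["bx"])                 # the box flows to the unbox operand
```

Under this encoding, `boxes_reaching(cache, "u")` answers the question posed earlier for the unbox operand: it returns `{"bx"}`, the label of the only box introduction that may reach it.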

It is important to note that the shapes in the cache may not
correspond exactly to the terms in the program, since reduction may
change program terms (e.g. by instantiating variables with values).
However, reduction does not change the outer shape and labeling of
values, and it is this reduction-invariant information that is
captured by shapes.

Clearly, not every choice of analysis pairs is meaningful for pro- 
gram optimization. While in general it is reasonable (indeed, un- 
avoidable) for an analysis to overestimate the set of terms associ- 
ated with a label, it is unacceptable for an analysis to underestimate 
the set of terms that flow to a label — most optimizations will pro- 
duce incorrect results, since they are designed around the idea that 
the analysis is telling them everything that could possibly flow to 
them. In order to capture the notion of when an analysis pair gives 
a suitable approximation of the flow of values in a program we 
follow the general spirit of Nielson et al. [7], and define a notion 
of an acceptable analysis. That is, we give a declarative specifi- 
cation that gives sufficient conditions for specifying when a given 
analysis does not underestimate the set of terms flowing to a label, 
without committing to a particular analysis. We arrange the subse- 
quent meta-theory such that our results apply to any analysis that is 
acceptable. In this way, we completely decouple our optimization 
from the particulars of how the analysis is computed. 

Our acceptable-analysis relation is given in Figures 6 and 7: the
judgement $C;\varrho \vdash (\rho, e)$ determines that an analysis
pair $(C, \varrho)$ is acceptable for a machine state $(\rho, e)$, and
similarly for the environment and expression forms of the judgement.
We use the notation $\mathit{lbl}(e)$ to denote the outermost label of
$e$: that is, $i$ where $e$ is of the form $m^i$ or $v^i$. The
acceptability judgement generally indicates for each syntactic form
what the flow of values is. For example, in the application rule, the
judgement insists that for every function value that flows to the
applicand position, the set of shapes associated with the parameter of
that function is a superset of the set of shapes associated with the
argument of the application; and that the set of shapes associated
with the result of the function is a subset of the set of shapes
associated with the application itself.
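
The application condition just described can be sketched as a check over a cache. The tuple encoding of shapes, the tags `"fun"` and `"fun-type"`, and the name `fun_c` are assumptions made for this example; the check itself follows the three requirements on a function shape reaching the applicand: equal caches for the type argument and the type parameter, argument shapes contained in the parameter's shapes, and result shapes contained in the application's shapes.

```python
def fun_c(cache, i, j, k, l):
    """funC(i, j, k, l) for an application (e1[tau] e2)^l, where
    i = lbl(e1), j = lbl(tau), and k = lbl(e2)."""
    def C(label):
        return cache.get(label, set())

    for shape in C(i):
        if shape[0] not in ("fun", "fun-type"):
            continue
        # Shape layout: (tag, type-parameter label, parameter label,
        # result label, own label).
        _, j2, k2, l2, _own = shape
        if C(j) != C(j2):          # type argument versus type parameter
            return False
        if not C(k) <= C(k2):      # the argument flows into the parameter
            return False
        if not C(l2) <= C(l):      # the result flows into the application
            return False
    return True


three = ("const", 3, "c")
base_ty = ("base-type", "t")
cache = {
    "fn": {("fun", "a", "x", "body", "f")},   # shapes at the applicand
    "ty": {base_ty}, "a": {base_ty},          # type argument and type parameter
    "arg": {three}, "x": {three},             # argument and parameter
    "body": {three}, "app": {three},          # function result and application
}
```

For this cache `fun_c(cache, "fn", "ty", "arg", "app")` holds. Emptying the entry for `"x"` makes the check fail, which models an analysis that underestimates what flows to the parameter.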

The judgement $C;\varrho \vdash \tau$ determines that an analysis pair
$(C, \varrho)$ is acceptable for a labeled type $\tau$. In particular,
if a function flows to a function type
$(\forall\alpha.\tau_1 \to \tau_2)^i$ then the set of values that flow
to the function's parameter can flow to the argument type $\tau_1$,
and the set of values that can flow from the result of the function
can flow to the result type $\tau_2$. And similarly for box types.

Given this definition, we can show that the acceptability relation 
is preserved under reduction. First we show that the cache is only 
refined by reduction. 

Lemma 12 (Cache refinement under reduction) 

If $C;\varrho \vdash \rho$, $C;\varrho \vdash e_1$, and
$(\rho, e_1) \mapsto (\rho, e_2)$ then
$C(\mathit{lbl}(e_1)) \supseteq C(\mathit{lbl}(e_2))$.

Proof: The proof is by induction on the derivation of
$(\rho, e_1) \mapsto (\rho, e_2)$. Consider the cases for the last
rule used to derive it (the cases are in the same order as in the
figure):

• (Variable instantiation.) In this case, $e_1 = x^k$, $e_2 = v^j$,
and $x{:}\tau = v^j \in \rho$. The assumption
$C;\varrho \vdash \rho$ requires that
$C(j) \subseteq C(\mathit{lbl}(\tau))$ and
$\varrho(x) = \mathit{lbl}(\tau)$. The assumption
$C;\varrho \vdash e_1$ requires that
$C(\varrho(x)) \subseteq C(k)$. Thus $C(j) \subseteq C(k)$. Clearly,
$\mathit{lbl}(e_1) = k$ and $\mathit{lbl}(e_2) = j$ and the result
follows.

• (Fix introduction.) In this case, clearly
$\mathit{lbl}(e_1) = \mathit{lbl}(e_2)$ and the result immediately
follows.

• (Box introduction.) In this case, clearly
$\mathit{lbl}(e_1) = \mathit{lbl}(e_2)$ and the result immediately
follows.

• (Application left.) In this case, clearly
$\mathit{lbl}(e_1) = \mathit{lbl}(e_2)$ and the result immediately
follows.

• (Application right.) In this case, clearly
$\mathit{lbl}(e_1) = \mathit{lbl}(e_2)$ and the result immediately
follows.



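$\mathit{funC}(i,j,k,l)$ and $\mathit{boxC}(i,j)$

$$\mathit{funC}(i,j,k,l) \;=\; \forall (\forall j'.k' \to l')^{i'} \in C(i):\ C(j) = C(j') \wedge C(k) \subseteq C(k') \wedge C(l') \subseteq C(l)$$

where the quantification ranges over both the term-level and the
type-level function shapes in $C(i)$.

$$\mathit{boxC}(i,j) \;=\; \bigl(\forall (\mathsf{box}_t\ j')^{i'} \in C(i):\ C(j') \subseteq C(j)\bigr) \wedge \bigl(\forall (\mathsf{box}\ j')^{i'} \in C(i):\ C(j') \subseteq C(j)\bigr)$$

$C;\varrho \vdash e$

$$\frac{C(\varrho(x)) \subseteq C(i)}{C;\varrho \vdash x^i}$$

$$\frac{\varrho(f) = i \qquad \varrho(x) = \mathit{lbl}(\tau_1) \qquad C;\varrho \vdash (\forall\alpha.\tau_1 \to \tau_2)^i \qquad C;\varrho \vdash e \qquad (\forall\varrho(\alpha).\mathit{lbl}(\tau_1) \to \mathit{lbl}(e))^i \in C(i)}{C;\varrho \vdash (\mathsf{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e)^i}$$

$$\frac{C;\varrho \vdash e_1 \qquad C;\varrho \vdash \tau \qquad C;\varrho \vdash e_2 \qquad \mathit{funC}(\mathit{lbl}(e_1), \mathit{lbl}(\tau), \mathit{lbl}(e_2), i)}{C;\varrho \vdash (e_1[\tau]\ e_2)^i}$$

$$\frac{C;\varrho \vdash \mathsf{box}(\tau)^i \qquad C;\varrho \vdash e \qquad (\mathsf{box}_{\mathit{tr}(\tau)}\ j)^i \in C(i) \qquad C(\mathit{lbl}(e)) \subseteq C(j)}{C;\varrho \vdash (\mathsf{box}_\tau\ e)^i}$$

$$\frac{C;\varrho \vdash e \qquad \mathit{boxC}(\mathit{lbl}(e), i)}{C;\varrho \vdash (\mathsf{unbox}\ e)^i}$$

$$\frac{C;\varrho \vdash \rho \qquad C;\varrho \vdash e \qquad C(\mathit{lbl}(e)) \subseteq C(i)}{C;\varrho \vdash \rho(e)^i}$$

$$\frac{C;\varrho \vdash B^i \qquad c^i \in C(i)}{C;\varrho \vdash c^i}$$

$$\frac{\varrho(f) = i \qquad \varrho(x) = \mathit{lbl}(\tau_1) \qquad C;\varrho \vdash (\forall\alpha.\tau_1 \to \tau_2)^i \qquad C;\varrho \vdash \rho \qquad C;\varrho \vdash e \qquad (\forall\varrho(\alpha).\mathit{lbl}(\tau_1) \to \mathit{lbl}(e))^i \in C(i)}{C;\varrho \vdash (\rho, \mathsf{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e)^i}$$

$$\frac{C;\varrho \vdash \mathsf{box}(\tau)^i \qquad C;\varrho \vdash v^j \qquad (\mathsf{box}_{\mathit{tr}(\tau)}\ k)^i \in C(i) \qquad C(j) \subseteq C(k)}{C;\varrho \vdash (v^j{:}\tau)^i}$$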
Figure 6. Acceptable Analysis, Expressions

$C;\varrho \vdash \tau$



C;ghr 



Cioia)) = C{i) 
C;g\-a' 

B' e C(i) 

C;ghB' 

C; ^ h ri C; gh T2 

(Vf>(Q).lbl(ri) ^ lbl(r2));' e C(j) 

/KnC(i,g(Q),lbl(ri),lbl(r2)) 

C; f) h (Va.n ^ ra)" 

C; e h r (box Ibl(r))^' G C(i) 6oiC(i, Ibl(r)) 

C; £1 h box(r)* 

VI < j < n : £>(a:j) = Ibl(rj) A C; g \- tj 
C;g\- xi-.Ti, . . . ,x„:t„ 



C;g^p 



C;gh xi-.Ti, . . . ,a;„:r„ 

VI < j < n : C{ij) C C(lbl(r,)) AC; g'r Vk"' 

C;g\- xr.Ti = «i'S . . . ,a;„:r„ = u„'" 



C;gh M 



C; g\- p C; g\- e 
C;gh (p,e) 



Figure 7. Acceptable Analysis, Other Constructs 

• (Application beta.) In this case, clearly
$\mathit{lbl}(e_1) = \mathit{lbl}(e_2)$ and the result immediately
follows.

• (Under box.) In this case, clearly
$\mathit{lbl}(e_1) = \mathit{lbl}(e_2)$ and the result immediately
follows.

• (Under unbox.) In this case, clearly
$\mathit{lbl}(e_1) = \mathit{lbl}(e_2)$ and the result immediately
follows.

• (Unbox beta.) In this case,
$e_1 = (\mathsf{unbox}\ (v^i{:}\tau)^j)^k$ and $e_2 = v^i$. The
hypothesis $C;\varrho \vdash e_1$ can be derived only by one rule and
it requires that $C;\varrho \vdash (v^i{:}\tau)^j$ (1), and
$\mathit{boxC}(j, k)$ (2). Judgement 1 can only be derived by one rule
and it requires that $C;\varrho \vdash v^i$ (4),
$(\mathsf{box}_{\mathit{tr}(\tau)}\ i'')^j \in C(j)$ (5) for some
$i''$, and $C(i) \subseteq C(i'')$ (6). Instantiating Fact 2 with
Fact 5 we get that $C(i'') \subseteq C(k)$ (7). Combining Facts 6 and
7, $C(i) \subseteq C(k)$, as we are required to prove.

• (Under frame.) In this case, clearly
$\mathit{lbl}(e_1) = \mathit{lbl}(e_2)$ and the result immediately
follows.

• (Frame return.) In this case, $e_1 = \rho'(v^i)^j$ and
$e_2 = v^i$. The assumption $C;\varrho \vdash e_1$ requires that
$C(i) \subseteq C(j)$. Since $\mathit{lbl}(e_1) = j$ and
$\mathit{lbl}(e_2) = i$, the result is immediate.

■ 
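
The unbox beta case of this proof can be replayed on a concrete cache. The sketch below is illustrative only: the tuple encoding of shapes, the name `box_c`, and the hand-written cache are assumptions. It checks the box constraint for a term of the form (unbox (v^i:τ)^j)^k and then confirms the chain of inclusions C(i) ⊆ C(i'') ⊆ C(k) used in the argument.

```python
def box_c(cache, i, j):
    """boxC(i, j): every box shape in C(i) has contents that flow into C(j)."""
    for shape in cache.get(i, set()):
        if shape[0] in ("box", "box-type"):
            # ("box", t, body, label) or ("box-type", body, label):
            # the contents label is the second-to-last component.
            body = shape[-2]
            if not cache.get(body, set()) <= cache.get(j, set()):
                return False
    return True


three = ("const", 3, "i")
cache = {
    "i": {three},                     # the boxed value v^i
    "i2": {three},                    # contents label i'' recorded in the box shape
    "j": {("box", "b", "i2", "j")},   # the box value (v^i : tau)^j
    "k": {three},                     # the unbox expression
}

# Acceptability of the unbox term requires boxC(j, k).
unbox_ok = box_c(cache, "j", "k")
```

Here `unbox_ok` is true, and the inclusions `cache["i"] <= cache["i2"] <= cache["k"]` hold, so the cache entry for the reduct v^i is contained in the entry for the original unbox term. Emptying the entry for `"k"` makes `box_c` fail, as expected for an analysis that omits what the unbox produces.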
Next we show a type substitution lemma for acceptability. 



Lemma 13 

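If $C;\varrho \vdash \tau$ and
$C(\mathit{lbl}(\tau)) = C(\varrho(\alpha))$ then:

• If $C;\varrho \vdash \tau'$ then $C;\varrho \vdash \tau'[\tau/\alpha]$.

• If $C;\varrho \vdash \Gamma$ then $C;\varrho \vdash \Gamma[\tau/\alpha]$.

• If $C;\varrho \vdash e$ then $C;\varrho \vdash e[\tau/\alpha]$.

• If $C;\varrho \vdash \rho$ then $C;\varrho \vdash \rho[\tau/\alpha]$.

Proof: The proof is by induction on the derivation of
$C;\varrho \vdash \tau'$ and $C;\varrho \vdash e$. Consider the cases
for the rules used to derive it (in the same order as in the figures):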

• The cases for expressions are straightforward.

• (Type variable) In this case $\tau' = \beta^i$. If
$\beta \neq \alpha$ then $\tau'[\tau/\alpha] = \tau'$ and the result
is immediate. Otherwise, by the rules for acceptability,
$C(\varrho(\alpha)) = C(i)$. If $\tau = \sigma^j$ then
$\tau'[\tau/\alpha] = \sigma^i$. Consider the cases for $\sigma$:

■ Subcase 1, $\sigma = \alpha'$: Then by the rules for acceptability,
$C(\varrho(\alpha')) = C(j)$. Since
$C(j) = C(\varrho(\alpha)) = C(i)$, $C(\varrho(\alpha')) = C(i)$, and
thus $C;\varrho \vdash \alpha'^i$, as required.

■ Subcase 2, a = Va'.ri — > T2: Since C; g \- t, the rules 
require: 

C; e h n (1) 

C;ghT2 (2) 

{Wg{a').lhl{rr) ^ lbl(r2)),'= £ C(j) (3) 

MC(j, ()(«'), lbl(n),lbl(r2)) (4) 

By(3)andC(j) = C(i): 

(V£?(Q').lbl(ri) ^ lbl(r2))' G C{i) (5) 

By(4)andC(j) = C(i): 

/unC(i, £.(Q'),lbl(^i),lbl(^2)) (6) 

By (1), (2), (5), and (6), by the rules for acceptability, 

C;g\-a\ 

■ Subcase 3, a = box(r"): Since C; g\- t, the rules require: 

C;ghr" (1) 

(boxlbl(r")),'eC(i) (2) 



6oxC(j,lbl(r")) 



(3) 



//\\k 



By (2) and C(j) = C(i), (box lbl(r")), G C{i) (4). By 
(3) and C(j) = C(i), boxC{i, lbl(r")) (5). By (1), (4), and 
(5), by the rules for acceptability, C; g\- ct*, as required. 

• (Base type) In this case $\tau'[\tau/\alpha] = \tau'$ and the
result is immediate.

(Function type) In this case r' — (Va'.ri —5- r2)\ The rules for 
acceptability require: 

C; f? h ri (1) 

C;ghT2 (2) 

(V£.(a')-lbl(ri) ^ lbl(r2))' G C{i) (3) 

funC{i, g(a'),\U{Ti),\U{T2)) (4) 

By (1), (2), and the induction hypothesis: 

C;£.hri[r/Q] (5) 
C;g^T2[T/a] (6) 

Sincelbl(ri[r/a]) = Ibl(ri) and lbl(ri[r/Q]) = Ibl(ri): 

(V£.(Q').lbl(ri[r/a]) ^ lbl(r2[T/Q]))^*^ G C(i) (7) 
funC{i, e(Q'),lbl(ri[r/a]), lbl(T2[r/a])) (8) 

Since $\mathit{lbl}(\tau'[\tau/\alpha]) = i$, by (5), (6), (7), and
(8), $C;\varrho \vdash \tau'[\tau/\alpha]$, as required.
• (Box type) In this case, $\tau' = \mathsf{box}(\tau'')^i$. The rules
for acceptability require:

C;ghT" (1) 

(boxlbl(r"))5'' GC(i) (2) 
boxC{i,lhl{T')) (3) 

By (1) and the induction hypothesis: 

C;g\-r"[T/a] (4) 

Sincelbl(r"[r/a]) = lbl(r"): 

(box lbl(r"[r /«])),*= eC(i) (5) 
boxC{i,lU{T"[T/a\)) (6) 

Since Ibl(r'[r/a]) = i, by (4), (5), and (6), C;o\- t'It/o], as 
required. 

• The cases for type and value environments are straightforward.

■ 
With these lemmas we can prove that reduction preserves ac- 
ceptability of the flow analysis. 

Lemma 14 (Preservation of acceptability under reduction) 

If $C;\varrho \vdash M$ and $M \mapsto M'$ then $C;\varrho \vdash M'$.

Proof: If $C;\varrho \vdash (\rho, e)$ then $C;\varrho \vdash \rho$
and $C;\varrho \vdash e$. If $(\rho, e) \mapsto (\rho, e')$ then the
result follows if we show that $C;\varrho \vdash e'$. The proof of the
latter is by induction on the derivation of
$(\rho, e) \mapsto (\rho, e')$. Consider the cases for the last rule
used to derive it (the cases are in the same order as in the figure):



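• In this case $e = x^i$, $e' = v^j$, and $x{:}\tau = v^j \in \rho$.
The assumption $C;\varrho \vdash \rho$ requires that
$C;\varrho \vdash v^j$, which is what we need to prove.

• In this case
$e = (\mathsf{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e'')^j$ for some
$f$, $\alpha$, $x$, $\tau_1$, $\tau_2$, $e''$, and $j$, and
$e' = (\rho, \mathsf{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e'')^j$. Let
$i = \mathit{lbl}(\tau_1)$. The first hypothesis can only be derived
by one rule and it requires that $\varrho(f) = j$, $\varrho(x) = i$,
$C;\varrho \vdash \tau$ where
$\tau = (\forall\alpha.\tau_1 \to \tau_2)^j$,
$C;\varrho \vdash e''$, and
$(\forall\varrho(\alpha).i \to \mathit{lbl}(e''))^j \in C(j)$. Then,
noting $C;\varrho \vdash \rho$ by assumption, by the rules for
acceptable analysis,
$C;\varrho \vdash (\rho, \mathsf{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e'')^j$,
as we are required to prove.

• In this case $e = (\mathsf{box}_\tau\ v^i)^j$ for some $\tau$, $v$,
$i$, and $j$, and $e' = (v^i{:}\tau)^j$. The first hypothesis can only
be derived by one rule and it requires that
$C;\varrho \vdash \mathsf{box}(\tau)^j$, $C;\varrho \vdash v^i$,
$(\mathsf{box}_{\mathit{tr}(\tau)}\ k)^j \in C(j)$ for some $k$, and
$C(i) \subseteq C(k)$. Then by the rules for acceptable analysis,
$C;\varrho \vdash (v^i{:}\tau)^j$, as we are required to prove.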

• In this case $e = (e_1[\tau]\ e_2)^i$ for some $e_1$, $\tau$, $e_2$,
and $i$, $e' = (e_1'[\tau]\ e_2)^i$, and
$(\rho, e_1) \mapsto (\rho, e_1')$ is a subderivation. The first
hypothesis can only be derived by one rule and it requires that
$C;\varrho \vdash e_1$ (1), $C;\varrho \vdash \tau$ (7),
$C;\varrho \vdash e_2$ (2), and
$\mathit{funC}(\mathit{lbl}(e_1), \mathit{lbl}(\tau), \mathit{lbl}(e_2), i)$
(3). By the induction hypothesis and Judgement 1,
$C;\varrho \vdash e_1'$ (4). By Lemma 12,
$C(\mathit{lbl}(e_1')) \subseteq C(\mathit{lbl}(e_1))$ (5). Combining
Facts 3 and 5,
$\mathit{funC}(\mathit{lbl}(e_1'), \mathit{lbl}(\tau), \mathit{lbl}(e_2), i)$
(6). Combining Facts 4, 7, 2, and 6, and using the rules for
acceptable analysis, we see that
$C;\varrho \vdash (e_1'[\tau]\ e_2)^i$, as we are required to prove.

• In this case $e = (v^j[\tau]\ e_2)^i$ for some $v$, $j$, $\tau$,
$e_2$, and $i$, $e' = (v^j[\tau]\ e_2')^i$, and
$(\rho, e_2) \mapsto (\rho, e_2')$ is a subderivation. The first
hypothesis can only be derived by one rule and it requires that
$C;\varrho \vdash v^j$ (1), $C;\varrho \vdash \tau$ (7),
$C;\varrho \vdash e_2$ (2), and
$\mathit{funC}(j, \mathit{lbl}(\tau), \mathit{lbl}(e_2), i)$ (3). By
the induction hypothesis and Judgement 2,
$C;\varrho \vdash e_2'$ (4). By Lemma 12,
$C(\mathit{lbl}(e_2')) \subseteq C(\mathit{lbl}(e_2))$ (5). Combining
Facts 3 and 5,
$\mathit{funC}(j, \mathit{lbl}(\tau), \mathit{lbl}(e_2'), i)$ (6).
Combining Facts 1, 7, 4, and 6, and using the rules for acceptable
analysis, we see that $C;\varrho \vdash (v^j[\tau]\ e_2')^i$, as we
are required to prove.
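• In this case:

$e = (v_1^j[\tau]\ v_2^k)^l$

$v_1 = (\rho', \mathsf{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e'')$

$e' = \rho''(e''[\tau/\alpha])^l$

$\rho'' = \rho', f{:}\tau' = v_1^j, x{:}\tau_1[\tau/\alpha] = v_2^k$

$\tau' = (\forall\alpha.\tau_1 \to \tau_2)^j$

for some $\rho'$, $f$, $\alpha$, $x$, $\tau_1$, $\tau_2$, $e''$, $j$,
$\tau$, $v_2$, $k$, and $l$. The first hypothesis can only be derived
by one rule and it requires that $C;\varrho \vdash v_1^j$ (1),
$C;\varrho \vdash \tau$ (2), $C;\varrho \vdash v_2^k$ (3), and
$\mathit{funC}(j, \mathit{lbl}(\tau), k, l)$ (4). Let
$i = \mathit{lbl}(\tau_1)$. Judgement 1 can only be derived by one
rule and it requires that $\varrho(f) = j$ (5), $\varrho(x) = i$ (6),
$C;\varrho \vdash \tau'$ (7), $C;\varrho \vdash e''$ (8), and
$(\forall\varrho(\alpha).i \to \mathit{lbl}(e''))^j \in C(j)$ (9).
Instantiating Fact 4 with Fact 9,
$C(\mathit{lbl}(\tau)) = C(\varrho(\alpha))$ (10),
$C(k) \subseteq C(i)$ (11), and
$C(\mathit{lbl}(e'')) \subseteq C(l)$ (12). Judgement 7 requires that
$C;\varrho \vdash \tau_1$ (13). By (13), (2), and (10), by Lemma 13,
$C;\varrho \vdash \tau_1[\tau/\alpha]$ (14). Since
$C;\varrho \vdash \rho'$ (also required by Judgement 1), (5), (7),
$C(j) \subseteq C(j)$, (1),
$\mathit{lbl}(\tau_1[\tau/\alpha]) = \mathit{lbl}(\tau_1) = i$ and
(6), (14), (11), and (3), we can derive $C;\varrho \vdash \rho''$
(15). By (8), (2), and (10), by Lemma 13,
$C;\varrho \vdash e''[\tau/\alpha]$ (16). By (15), (16), and
$\mathit{lbl}(e''[\tau/\alpha]) = \mathit{lbl}(e'')$ and (12), we can
derive $C;\varrho \vdash e'$, as required.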

• In this case $e = (\mathrm{box}_\tau\, e_1)^i$ for some $\tau$, $e_1$, and $i$, $e' =
(\mathrm{box}_\tau\, e_2)^i$ for some $e_2$, and $(\rho, e_1) \longmapsto (\rho, e_2)$ is a subderivation.
The first hypothesis can only be derived by one rule and it
requires that $C;\varrho \vdash \mathrm{box}(\tau)^i$ (7), $C;\varrho \vdash e_1$ (1),
$(\mathrm{box}_{\mathit{tr}(\tau)}\, j)^i \in C(i)$ (2) for some $j$, and $C(\mathrm{lbl}(e_1)) \subseteq C(j)$ (3).
By the induction hypothesis and Judgement 1, $C;\varrho \vdash e_2$ (4). By Lemma 12,
$C(\mathrm{lbl}(e_2)) \subseteq C(\mathrm{lbl}(e_1))$ (5). Combining Facts 3 and 5 gives
$C(\mathrm{lbl}(e_2)) \subseteq C(j)$ (6). Then by Facts 7, 4, 2, and 6, and using
the rules for acceptable analysis, $C;\varrho \vdash (\mathrm{box}_\tau\, e_2)^i$, as we are
required to prove.

• In this case $e = (\mathrm{unbox}\ e_1)^i$ for some $e_1$ and $i$, $e' =
(\mathrm{unbox}\ e_2)^i$ for some $e_2$, and $(\rho, e_1) \longmapsto (\rho, e_2)$ is a
subderivation. The first hypothesis can only be derived by one rule
and it requires that $C;\varrho \vdash e_1$ (1) and $\mathit{boxC}(\mathrm{lbl}(e_1), i)$ (2). By
Judgement 1 and the induction hypothesis, $C;\varrho \vdash e_2$ (3). By
Lemma 12, $C(\mathrm{lbl}(e_2)) \subseteq C(\mathrm{lbl}(e_1))$ (4). Combining Facts 4
and 2, $\mathit{boxC}(\mathrm{lbl}(e_2), i)$ (5). Combining Facts 3 and 5, by the
rules for acceptable analysis, $C;\varrho \vdash (\mathrm{unbox}\ e_2)^i$, as we are
required to prove.

• In this case $e = (\mathrm{unbox}\ \langle v^i{:}\tau \rangle^j)^k$ for some $\tau$, $v$, $i$, $j$, and $k$,
and $e' = v^i$. The first hypothesis can only be derived by one
rule and it requires that $C;\varrho \vdash \langle v^i{:}\tau \rangle^j$, which in turn can only
be derived by one rule that requires that $C;\varrho \vdash v^i$, as we are
required to prove.

• In this case $e = \rho'(e'')^i$, $e' = \rho'(e''')^i$, and $(\rho', e'') \longmapsto
(\rho', e''')$ is a subderivation. Assumption $C;\varrho \vdash e$ requires that
$C;\varrho \vdash \rho'$ (1), $C;\varrho \vdash e''$ (2), and $C(\mathrm{lbl}(e'')) \subseteq C(i)$ (3).
By (1), (2), and the induction hypothesis, $C;\varrho \vdash e'''$ (4). By
Lemma 12, $C(\mathrm{lbl}(e''')) \subseteq C(\mathrm{lbl}(e''))$ (5). Combining (3) and
(5), $C(\mathrm{lbl}(e''')) \subseteq C(i)$ (6). Using (1), (4), and (6) we derive
$C;\varrho \vdash \rho'(e''')^i$, as required.

• In this case $e = \rho'(v^j)^i$ and $e' = v^j$. The assumption $C;\varrho \vdash e$
unpacks to requiring that $C;\varrho \vdash v^j$, as required.



Lemma 15 (Many-step reduction preserves acceptability)

If $C;\varrho \vdash M$ and $M \longmapsto^* M'$ then $C;\varrho \vdash M'$.

Proof: The proof is by a straightforward induction on the length
of the reduction sequence and Lemma 14. ■

We can also show an important connection between typing and
acceptable flow analysis, namely that the cache of an expression's
type is contained in the cache of that expression.

Lemma 16

If $\Delta;\Gamma \vdash e : \tau$, $C;\varrho \vdash \Gamma$, and $C;\varrho \vdash e$ then
$C(\mathrm{lbl}(\tau)) \subseteq C(\mathrm{lbl}(e))$ and $C;\varrho \vdash \tau$.

Proof: The proof is by induction on the derivation of $\Delta;\Gamma \vdash e : \tau$.
Consider the cases for the last rule used (in the same order as the figure):

• (Variable) In this case $e = x^i$ and $x{:}\tau \in \Gamma$. By the rules for
acceptable analysis, $\varrho(x) = \mathrm{lbl}(\tau)$ (1), $C;\varrho \vdash \tau$ (2), and
$C(\varrho(x)) \subseteq C(i)$ (3). By (1) and (3), $C(\mathrm{lbl}(\tau)) \subseteq C(i)$ (4).
The result is (4) and (2).

• (Fix expression) In this case, $e = (\mathrm{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e')^i$
and $\tau = (\forall\alpha.\tau_1 \to \tau_2)^i$. The first part is immediate since
$\mathrm{lbl}(\tau) = \mathrm{lbl}(e)$. The second part is required by $C;\varrho \vdash e$.

• (Application) In this case:

$e = (e_1[\tau']\, e_2)^i$
$\Delta;\Gamma \vdash e_1 : (\forall\alpha.\tau_1 \to \tau_3)^j$ (1)
$\tau = \tau_3[\tau'/\alpha]$

By the rules for acceptability, $C;\varrho \vdash e_1$ (2), $C;\varrho \vdash \tau'$ (3),
$C;\varrho \vdash e_2$, and $\mathit{funC}(\mathrm{lbl}(e_1), \mathrm{lbl}(\tau'), \mathrm{lbl}(e_2), i)$ (4). By (1),
(2), and the induction hypothesis, $C(j) \subseteq C(\mathrm{lbl}(e_1))$ (5) and
$C;\varrho \vdash (\forall\alpha.\tau_1 \to \tau_3)^j$ (6). By (6) and the rules for
acceptability, $C;\varrho \vdash \tau_3$ (7) and
$(\forall\varrho(\alpha).\mathrm{lbl}(\tau_1) \to \mathrm{lbl}(\tau_3))^j \in C(j)$
(8). By (5), instantiating (4) with (8), $C(\mathrm{lbl}(\tau')) = C(\varrho(\alpha))$
(9) and $C(\mathrm{lbl}(\tau_3)) \subseteq C(i)$, so since $\mathrm{lbl}(\tau) = \mathrm{lbl}(\tau_3)$,
$C(\mathrm{lbl}(\tau)) \subseteq C(i)$ (10). By (3), (7), and (9), $C;\varrho \vdash \tau_3[\tau'/\alpha]$
(11). The result is (10) and (11).

• (Box expression) In this case $e = (\mathrm{box}_{\tau'}\, e')^i$ and $\tau =
\mathrm{box}(\tau')^i$. The first part holds as $\mathrm{lbl}(e) = \mathrm{lbl}(\tau)$. The second
part is required by $C;\varrho \vdash e$.

• (Unbox) In this case $e = (\mathrm{unbox}\ e')^i$ and $\Delta;\Gamma \vdash e' : \mathrm{box}(\tau)^j$
(1) is a subderivation. By the rules for acceptability, $C;\varrho \vdash e'$
(2) and $\mathit{boxC}(\mathrm{lbl}(e'), i)$ (3). By (1), (2), and the induction
hypothesis, $C(j) \subseteq C(\mathrm{lbl}(e'))$ (4) and $C;\varrho \vdash \mathrm{box}(\tau)^j$
(5). By (5) and the rules for acceptability, $C;\varrho \vdash \tau$ (6) and
$(\mathrm{box}\ \mathrm{lbl}(\tau))^j \in C(j)$ (7). By (4), instantiating (3) with (7),
$C(\mathrm{lbl}(\tau)) \subseteq C(i)$ (8). The result is (8) and (6).

• (Frame) In this case $e = \rho(e')^i$, $\vdash \rho : \Gamma'$ (1), and $\emptyset;\Gamma' \vdash e' : \tau$
(2). By the rules for acceptability, $C;\varrho \vdash \rho$ (3), $C;\varrho \vdash e'$ (4),
and $C(\mathrm{lbl}(e')) \subseteq C(i)$ (5). By (1), (3), the rules for typing,
and the rules for acceptability, $C;\varrho \vdash \Gamma'$ (6). By (6), (2), (4),
and the induction hypothesis, $C(\mathrm{lbl}(\tau)) \subseteq C(\mathrm{lbl}(e'))$ (7) and
$C;\varrho \vdash \tau$ (8). By (7) and (5), $C(\mathrm{lbl}(\tau)) \subseteq C(i)$ (9). The result
is (9) and (8).

• (Constant) In this case $e = c^i$ and $\tau = \mathrm{B}^i$. The first part
clearly holds as $\mathrm{lbl}(e) = \mathrm{lbl}(\tau)$. The second part is required
by $C;\varrho \vdash e$.

• (Fix value) In this case, $e = \langle \rho, \mathrm{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e' \rangle^i$ and
$\tau = (\forall\alpha.\tau_1 \to \tau_2)^i$. The first part holds as $\mathrm{lbl}(e) = \mathrm{lbl}(\tau)$. The
second part is required by $C;\varrho \vdash e$.

• (Box value) In this case $e = \langle v^j{:}\tau' \rangle^i$ and $\tau = \mathrm{box}(\tau')^i$. The
first part holds as $\mathrm{lbl}(e) = \mathrm{lbl}(\tau)$. The second part is required
by $C;\varrho \vdash e$.

■



4. Unboxing 

The goal of the unboxing optimization is to use the information pro- 
vided by a flow analysis to replace a boxed object with the contents 
of the box. Doing so may change the traceability, since the object in 
the box may not be a GC-managed reference. Moreover, the object 
in the box may itself be a candidate for unboxing; consequently, 
determining the traceability of boxed objects depends on exactly 
which objects are unboxed. Function parameters may be instanti- 
ated with objects from multiple different definition sites, some of 
which may be unboxed and some of which may not. 

Consider again the first example from Section 2, written out 
with explicit type information and labels: 



let 



0^1^ 



fix /0(a;:box(B ) ):box(box(B 



jIOnH 



•(b0Xb„,(B5)6 X ) 



zi = (boXj9 3^" 

Z2 = /D(^l^')'' 

1 R 17 

in (unbox (unbox 2:2 '') ) 



It is fairly easy to see that this program is unboxable. The binding 
site for x is only reached by the term labeled with 11 (the outer 
box introduction), and hence there should be no problems with 
changing its type annotation. Each box elimination is reached only 
by a single box introduction, and hence the box/unbox pairs in this 
program should be eliminable, yielding an optimized program: 

let 

fix fW{x:B^y.B\x'' 

inZ2^'^ 

Notice that in order to rewrite the program, we have had to change
the types of both f and x, since we have eliminated the box
introductions on the argument and in the body of f. The change
in type of x has changed its traceability from r to b. If we choose
(perhaps because of limitations on the precision of the analysis, or
perhaps because of other constraints) to only eliminate the outer
box/unbox pair, then we must similarly adjust types on the
remaining box introduction (labeled with 8).

let 

fix /Q(a;:B°):box(B')l(boxB5 x''f 



21 = 3i« 
in (unbox Z2^''] 



12x13 
.16 



Clearly then, to optimize these programs in a type preserving 
fashion, we must rewrite types along with the terms in the pro- 
gram. In the rest of this section, we first develop a framework for 
specifying an unboxing assignment regardless of any correctness 
concerns, and then separately define a judgement specifying when 
such an assignment is a reasonable one. 
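Types, $\lfloor \tau \rfloor_T$:

$\lfloor \alpha^i \rfloor_T = \alpha^i$
$\lfloor \mathrm{B}^i \rfloor_T = \mathrm{B}^i$
$\lfloor (\forall\alpha.\tau_1 \to \tau_2)^i \rfloor_T = (\forall\alpha.\lfloor \tau_1 \rfloor_T \to \lfloor \tau_2 \rfloor_T)^i$
$\lfloor \mathrm{box}(\tau)^i \rfloor_T = \lfloor \tau \rfloor_T$    if $i \in T$
$\lfloor \mathrm{box}(\tau)^i \rfloor_T = \mathrm{box}(\lfloor \tau \rfloor_T)^i$    if $i \notin T$

Type environments, $\lfloor \Gamma \rfloor_T$:

$\lfloor x_1{:}\tau_1, \ldots, x_n{:}\tau_n \rfloor_T = x_1{:}\lfloor \tau_1 \rfloor_T, \ldots, x_n{:}\lfloor \tau_n \rfloor_T$

Expressions, $\lfloor e \rfloor_T$:

$\lfloor x^i \rfloor_T = x^i$
$\lfloor (\mathrm{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e)^i \rfloor_T = (\mathrm{fix}\ f[\alpha](x{:}\lfloor \tau_1 \rfloor_T){:}\lfloor \tau_2 \rfloor_T.\lfloor e \rfloor_T)^i$
$\lfloor (e_1[\tau]\, e_2)^i \rfloor_T = (\lfloor e_1 \rfloor_T[\lfloor \tau \rfloor_T]\, \lfloor e_2 \rfloor_T)^i$
$\lfloor (\mathrm{box}_\tau\, e)^i \rfloor_T = \lfloor e \rfloor_T$    if $i \in T$
$\lfloor (\mathrm{box}_\tau\, e)^i \rfloor_T = (\mathrm{box}_{\lfloor \tau \rfloor_T}\, \lfloor e \rfloor_T)^i$    if $i \notin T$
$\lfloor (\mathrm{unbox}\ e)^i \rfloor_T = \lfloor e \rfloor_T$    if $\mathrm{lbl}(e) \in T$
$\lfloor (\mathrm{unbox}\ e)^i \rfloor_T = (\mathrm{unbox}\ \lfloor e \rfloor_T)^i$    if $\mathrm{lbl}(e) \notin T$
$\lfloor \rho(e)^i \rfloor_T = \lfloor \rho \rfloor_T(\lfloor e \rfloor_T)^i$
$\lfloor c^i \rfloor_T = c^i$

Values, $\lfloor v^i \rfloor_T$:

$\lfloor v^i \rfloor_T = \langle \lfloor \rho \rfloor_T, \mathrm{fix}\ f[\alpha](x{:}\lfloor \tau_1 \rfloor_T){:}\lfloor \tau_2 \rfloor_T.\lfloor e \rfloor_T \rangle^i$
    where $v = \langle \rho, \mathrm{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e \rangle$
$\lfloor \langle v^j{:}\tau \rangle^i \rfloor_T = \lfloor v^j \rfloor_T$    if $i \in T$
$\lfloor \langle v^j{:}\tau \rangle^i \rfloor_T = \langle \lfloor v^j \rfloor_T{:}\lfloor \tau \rfloor_T \rangle^i$    if $i \notin T$

Environments, $\lfloor \rho \rfloor_T$:

$\lfloor x_1{:}\tau_1 = v_1^{i_1}, \ldots, x_n{:}\tau_n = v_n^{i_n} \rfloor_T = x_1{:}\lfloor \tau_1 \rfloor_T = \lfloor v_1^{i_1} \rfloor_T, \ldots, x_n{:}\lfloor \tau_n \rfloor_T = \lfloor v_n^{i_n} \rfloor_T$

Machine states, $\lfloor M \rfloor_T$:

$\lfloor (\rho, e) \rfloor_T = (\lfloor \rho \rfloor_T, \lfloor e \rfloor_T)$

Figure 8. Unboxing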



4.1 The unboxing optimization 

We specify a particular choice of unboxing via an unboxing set T 
which contains the set of labels of terms and types to be unboxed. 
A choice of a particular T then induces an unboxing function as de- 
fined in Figure 8. The unboxing function is defined in a straightfor- 
ward compositional manner. Box introductions are dropped when 
their labels are in the unboxing set, box type constructors are 
dropped when their labels are in the unboxing set, box eliminations 
are dropped when the labels of their arguments are in the unboxing 
set, and all other terms and types are left unchanged. 
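The compositional definition above can be sketched in a few lines of Python. This is only an illustrative model, restricted to base types, box types, constants, variables, box introductions, and box eliminations; functions, frames, and environments are omitted. The class names and the helpers `unbox_type` and `unbox_term` are assumptions made for this sketch and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Union

# Labelled types: B^i and box(t)^i (hypothetical encoding).
@dataclass(frozen=True)
class Base:
    label: int

@dataclass(frozen=True)
class BoxTy:
    inner: "Ty"
    label: int

Ty = Union[Base, BoxTy]

# Labelled terms: c^i, x^i, (box_t e)^i, (unbox e)^i.
@dataclass(frozen=True)
class Const:
    value: int
    label: int

@dataclass(frozen=True)
class Var:
    name: str
    label: int

@dataclass(frozen=True)
class Box:
    ann: Ty
    body: "Term"
    label: int

@dataclass(frozen=True)
class Unbox:
    body: "Term"
    label: int

Term = Union[Const, Var, Box, Unbox]

def unbox_type(t, T):
    """Drop a box type constructor when its label is in the unboxing set T."""
    if isinstance(t, BoxTy):
        inner = unbox_type(t.inner, T)
        return inner if t.label in T else BoxTy(inner, t.label)
    return t

def unbox_term(e, T):
    """Drop box introductions whose label is in T, and box eliminations
    whose ARGUMENT's label is in T; leave everything else unchanged."""
    if isinstance(e, Box):
        body = unbox_term(e.body, T)
        if e.label in T:
            return body
        return Box(unbox_type(e.ann, T), body, e.label)
    if isinstance(e, Unbox):
        body = unbox_term(e.body, T)
        if e.body.label in T:
            return body
        return Unbox(body, e.label)
    return e

# (unbox (box_{B^0} 3^1)^2)^3
example = Unbox(Box(Base(0), Const(3, 1), 2), 3)
```

With the unboxing set `{2}`, both the box introduction labelled 2 and the elimination applied to it disappear, leaving the constant; with the empty set the term is returned unchanged.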

An important observation about the unboxing optimization as
we have defined it is that, unlike many previous interprocedural
approaches (Section 7), it only improves programs and never
introduces instructions or allocation. This is easy to see, since the
unboxing function only removes boxes (which allocate and have an
instruction cost) and unboxes (which have an instruction cost), and
never introduces any new operations at all.

4.2 Acceptable unboxings 

While any choice of T defines an unboxing, not every unboxing
set is reasonable in the sense that it defines a type and semantics
preserving optimization. Just as we defined a notion of acceptable
analysis in Section 3, we will define a judgement that captures
sufficient conditions for ensuring correctness of an unboxing, without
specifying a particular method of choosing such an unboxing. By
using analyses of different precisions or choosing different
optimization strategies we may end up with quite different choices of
unboxings; however, so long as they satisfy our notion of
acceptability we can be sure that they will preserve correctness.

Informally, a choice of an unboxing set is reasonable if it meets 
two criteria. Firstly, it must make uniform choices in the sense 
that if a box introduction is eliminated, then all of the types and 
elimination forms to which it flows must also be unboxed, and 
vice versa. Secondly, we must ensure that types remain consistent 
with their uses in polymorphic instantiations, since we do not allow 
polymorphism over base types. 

We use the notation $i \overset{T}{\sim} j$ to indicate when an unboxing agrees
at two labels i and j:

$i \overset{T}{\sim} j$ iff either $i, j \in T$ or $i, j \notin T$



The first requirement is then specified via the cache consistency 
judgement, which enforces that for any label i, the unboxing set 
must agree on i and the labels of any shapes in the cache of i. 



$$\frac{\forall i, s : s \in C(i) \implies i \overset{T}{\sim} \mathrm{lbl}(s)}{C \vdash T}$$
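As a rough illustration of the cache consistency judgement, the following Python sketch checks it for a finite cache. It assumes, purely for the example, that each shape is represented only by its own label, so the cache maps a label to the set of labels of the shapes it contains; `agrees` and `cache_consistent` are hypothetical names.

```python
from typing import Dict, Set

def agrees(T: Set[int], i: int, j: int) -> bool:
    """Agreement at two labels: both are in T, or neither is."""
    return (i in T) == (j in T)

def cache_consistent(cache: Dict[int, Set[int]], T: Set[int]) -> bool:
    """Holds when every label i agrees with lbl(s) for every shape s in C(i).
    Assumption: a shape is modelled by its label alone."""
    return all(agrees(T, i, s) for i, shapes in cache.items() for s in shapes)

# Hypothetical cache: the shape labelled 1 reaches labels 1 and 2;
# label 3 holds only its own shape.
cache = {1: {1}, 2: {1}, 3: {3}}
```

Here `{1, 2}` and `{3}` are consistent choices, while `{1}` is not, because the shape labelled 1 also reaches label 2.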

The second requirement is specified via the consistent unboxing
judgement of Figure 9. The type rules determine the traceability of
the unboxed type: that is, the judgement $T \vdash \tau : t$ indicates that
unboxing $\tau$ with $T$ will result in a type of traceability $t$. The key use
of the type judgement is in the term-level polymorphic instantiation
rule, which requires that the traceability of the unboxed type be $r$.
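The type rules of Figure 9 can be read as a small recursive function. The Python sketch below is an illustration under stated assumptions: the class names, `traceability`, and `instantiable` are invented for the example, and the traceabilities are written as the strings `"r"` and `"b"`.

```python
from dataclasses import dataclass
from typing import Set, Union

@dataclass(frozen=True)
class TyVar:
    name: str
    label: int

@dataclass(frozen=True)
class Base:
    label: int

@dataclass(frozen=True)
class Fun:
    param: str
    arg: "Ty"
    res: "Ty"
    label: int

@dataclass(frozen=True)
class BoxTy:
    inner: "Ty"
    label: int

Ty = Union[TyVar, Base, Fun, BoxTy]

def traceability(t: Ty, T: Set[int]) -> str:
    """Traceability of the type obtained by unboxing t with T."""
    if isinstance(t, Base):
        return "b"
    if isinstance(t, BoxTy) and t.label in T:
        # The box constructor is dropped, so the contents decide.
        return traceability(t.inner, T)
    # Type variables, function types, and boxes that are kept.
    return "r"

def instantiable(t: Ty, T: Set[int]) -> bool:
    """Side condition on a type argument: its unboxed form must have traceability r."""
    return traceability(t, T) == "r"
```

For example, `box(B^0)^1` has traceability `"r"` when label 1 is kept and `"b"` when label 1 is in the unboxing set, in which case it can no longer be used as a type argument.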

4.3 Type Preservation 

Our goal is to show that the unboxing function induced by any 
acceptable unboxing is in some sense correct as an optimization. 
The first part of this is to show that unboxing preserves typing. One 
key property is that types have non-empty caches. 

Lemma 17 (Type Inhabitance)

If $\tau$ is not a type variable and $C;\varrho \vdash \tau$ then $C(\mathrm{lbl}(\tau)) \neq \emptyset$.

Proof: The proof is by inspection of the rules for acceptability. ■
We also need several technical properties: labels agree when
their caches intersect, unboxing preserves type well formedness,
type traceability, and type equality, and unboxing commutes with
type substitution.

Lemma 18 (Agreement)

If $C \vdash T$ and $C(i) \cap C(j) \neq \emptyset$ then $i \overset{T}{\sim} j$.

Proof: The proof is by inspection of the rules for cache
consistency. ■
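The inspection can be spelled out in four steps. The display below is a sketch that uses only the cache consistency rule and the definition of agreement.

```latex
\begin{align*}
& s \in C(i) \cap C(j)
  && \text{(a witness exists since } C(i) \cap C(j) \neq \emptyset\text{)} \\
& i \overset{T}{\sim} \mathrm{lbl}(s)
  && \text{(cache consistency, since } s \in C(i)\text{)} \\
& j \overset{T}{\sim} \mathrm{lbl}(s)
  && \text{(cache consistency, since } s \in C(j)\text{)} \\
& i \overset{T}{\sim} j
  && \text{(agreement is symmetric and transitive)}
\end{align*}
```

The last step holds because agreement only compares membership in T, so it is an equivalence relation on labels.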

Lemma 19

If $\Delta \vdash \tau$ wf then $\Delta \vdash \lfloor \tau \rfloor_T$ wf.

Proof: The proof is a straightforward induction on the structure
of $\tau$. ■

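Lemma 20

If $C;\varrho \vdash \tau_2$ and $C(\mathrm{lbl}(\tau_2)) = C(\varrho(\alpha))$ then:

• If $C;\varrho \vdash \tau_1$ then $\lfloor \tau_1[\tau_2/\alpha] \rfloor_T = \lfloor \tau_1 \rfloor_T[\lfloor \tau_2 \rfloor_T/\alpha]$.

• If $C;\varrho \vdash e$ then $\lfloor e[\tau_2/\alpha] \rfloor_T = \lfloor e \rfloor_T[\lfloor \tau_2 \rfloor_T/\alpha]$.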

Proof: 
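Types, $T \vdash \tau : t$:

$$\frac{}{T \vdash \alpha^i : r} \qquad \frac{}{T \vdash \mathrm{B}^i : b} \qquad \frac{}{T \vdash (\forall\alpha.\tau_1 \to \tau_2)^i : r}$$

$$\frac{i \in T \quad T \vdash \tau : t}{T \vdash \mathrm{box}(\tau)^i : t} \qquad \frac{i \notin T}{T \vdash \mathrm{box}(\tau)^i : r}$$

Expressions, $T \vdash e$:

$$\frac{}{T \vdash x^i} \qquad \frac{T \vdash e}{T \vdash (\mathrm{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e)^i} \qquad \frac{T \vdash e_1 \quad T \vdash e_2 \quad T \vdash \tau : r}{T \vdash (e_1[\tau]\, e_2)^i}$$

$$\frac{T \vdash e}{T \vdash (\mathrm{box}_\tau\, e)^i} \qquad \frac{T \vdash e}{T \vdash (\mathrm{unbox}\ e)^i} \qquad \frac{T \vdash \rho \quad T \vdash e}{T \vdash \rho(e)^i} \qquad \frac{}{T \vdash c^i}$$

Values, $T \vdash v^i$:

$$\frac{T \vdash \rho \quad T \vdash e}{T \vdash \langle \rho, \mathrm{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e \rangle^i} \qquad \frac{T \vdash v^j}{T \vdash \langle v^j{:}\tau \rangle^i}$$

Environments, $T \vdash \rho$:

$$\frac{\forall 1 \le j \le n : T \vdash v_j^{i_j}}{T \vdash x_1{:}\tau_1 = v_1^{i_1}, \ldots, x_n{:}\tau_n = v_n^{i_n}}$$

Machine states, $T \vdash M$:

$$\frac{T \vdash \rho \quad T \vdash e}{T \vdash (\rho, e)}$$

Figure 9. Consistent unboxing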



The proof is by induction on the structure of ri . Consider the 
cases for ri : 

■ Case 1, ri = a^: If T2 = cr-' and J(t-'|,-i- = a' then 
Ti[T2/a] = o-', thus Jri[r2/Q]t-r = \(t'[-^, and also 
J ''"1 It [J ''"2 It/q^] = ''"'*• Thus I need to show that J a* tf — 
cr". When a is not a box type, this condition follows easily 
from the definitions. When cr is a box type, this condition 

T 

follows if i ~ j. By C; £> h T2 and Lemma 17, C{j) 7^ 0. 
By C;g h n, C(lbl(r2)) = C{g{a)), and the rules for 

T 

acceptability, C(i) = C(j). By Lemma 18, i ~ j, as 
required. 

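■ Case 2, $\tau_1 = \beta^i$ and $\alpha \neq \beta$: In this case $\tau_1[\tau_2/\alpha] = \tau_1$,
$\lfloor \tau_1 \rfloor_T = \tau_1$, and the result is immediate.

■ Case 3, $\tau_1 = (\forall\alpha'.\tau_3 \to \tau_4)^i$: Then $C;\varrho \vdash \tau_1$ requires
$C;\varrho \vdash \tau_3$ and $C;\varrho \vdash \tau_4$. By the induction hypothesis,
$\lfloor \tau_3[\tau_2/\alpha] \rfloor_T = \lfloor \tau_3 \rfloor_T[\lfloor \tau_2 \rfloor_T/\alpha]$ and
$\lfloor \tau_4[\tau_2/\alpha] \rfloor_T = \lfloor \tau_4 \rfloor_T[\lfloor \tau_2 \rfloor_T/\alpha]$. Thus:

$\lfloor \tau_1[\tau_2/\alpha] \rfloor_T$
$= \lfloor (\forall\alpha'.\tau_3[\tau_2/\alpha] \to \tau_4[\tau_2/\alpha])^i \rfloor_T$
$= (\forall\alpha'.\lfloor \tau_3[\tau_2/\alpha] \rfloor_T \to \lfloor \tau_4[\tau_2/\alpha] \rfloor_T)^i$
$= (\forall\alpha'.\lfloor \tau_3 \rfloor_T[\lfloor \tau_2 \rfloor_T/\alpha] \to \lfloor \tau_4 \rfloor_T[\lfloor \tau_2 \rfloor_T/\alpha])^i$
$= (\forall\alpha'.\lfloor \tau_3 \rfloor_T \to \lfloor \tau_4 \rfloor_T)^i[\lfloor \tau_2 \rfloor_T/\alpha]$
$= \lfloor \tau_1 \rfloor_T[\lfloor \tau_2 \rfloor_T/\alpha]$

■ Case 4, $\tau_1 = \mathrm{box}(\tau)^i$: Then $C;\varrho \vdash \tau_1$ requires $C;\varrho \vdash \tau$.
The induction hypothesis is $\lfloor \tau[\tau_2/\alpha] \rfloor_T = \lfloor \tau \rfloor_T[\lfloor \tau_2 \rfloor_T/\alpha]$.
If $i \in T$ then $\lfloor \tau_1 \rfloor_T = \lfloor \tau \rfloor_T$ and $\lfloor \tau_1[\tau_2/\alpha] \rfloor_T =
\lfloor \tau[\tau_2/\alpha] \rfloor_T$, as required. If $i \notin T$ then:

$\lfloor \tau_1[\tau_2/\alpha] \rfloor_T$
$= \lfloor \mathrm{box}(\tau[\tau_2/\alpha])^i \rfloor_T$
$= \mathrm{box}(\lfloor \tau[\tau_2/\alpha] \rfloor_T)^i$
$= \mathrm{box}(\lfloor \tau \rfloor_T[\lfloor \tau_2 \rfloor_T/\alpha])^i$
$= \mathrm{box}(\lfloor \tau \rfloor_T)^i[\lfloor \tau_2 \rfloor_T/\alpha]$
$= \lfloor \tau_1 \rfloor_T[\lfloor \tau_2 \rfloor_T/\alpha]$

For the second part, the proof is a straightforward induction on the structure of $e$. ■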



Lemma 21

If $T \vdash \tau : t$ then $\mathit{tr}(\lfloor \tau \rfloor_T) = t$.

Proof: The proof is a straightforward induction on the derivation
of $T \vdash \tau : t$. ■

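Lemma 22

If $\vdash \tau_1 = \tau_2$, $C \vdash T$, $C;\varrho \vdash \tau_1$, $C;\varrho \vdash \tau_2$, and either
$C(\mathrm{lbl}(\tau_1)) \subseteq C(\mathrm{lbl}(\tau_2))$ or $C(\mathrm{lbl}(\tau_2)) \subseteq C(\mathrm{lbl}(\tau_1))$ then
$\vdash \lfloor \tau_1 \rfloor_T = \lfloor \tau_2 \rfloor_T$.

Proof: The proof is by induction on the derivation of $\vdash \tau_1 = \tau_2$.
Consider the last rule used (in the same order as the figure):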

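• (Type variable) In this case $\tau_1 = \alpha^i$ and $\tau_2 = \alpha^j$. By definition,
$\lfloor \tau_1 \rfloor_T = \tau_1$ and $\lfloor \tau_2 \rfloor_T = \tau_2$, and the result is immediate.

• (Base) In this case $\tau_1 = \mathrm{B}^i$ and $\tau_2 = \mathrm{B}^j$. By definition,
$\lfloor \tau_1 \rfloor_T = \tau_1$ and $\lfloor \tau_2 \rfloor_T = \tau_2$, and the result is immediate.

• (Function) In this case:

$\tau_1 = (\forall\alpha.\tau_{11} \to \tau_{12})^{i_1}$
$\tau_2 = (\forall\alpha.\tau_{21} \to \tau_{22})^{i_2}$
$\vdash \tau_{11} = \tau_{21}$ (1)
$\vdash \tau_{12} = \tau_{22}$ (2)

WLOG, assume $C(i_1) \subseteq C(i_2)$ (3). By the rules for acceptability:

$C;\varrho \vdash \tau_{11}$ (4)
$C;\varrho \vdash \tau_{12}$ (5)
$(\forall\varrho(\alpha).\mathrm{lbl}(\tau_{11}) \to \mathrm{lbl}(\tau_{12}))^{i_1} \in C(i_1)$ (6)
$C;\varrho \vdash \tau_{21}$ (7)
$C;\varrho \vdash \tau_{22}$ (8)
$\mathit{funC}(i_2, \varrho(\alpha), \mathrm{lbl}(\tau_{21}), \mathrm{lbl}(\tau_{22}))$ (9)

By (6), (3), and (9), $C(\mathrm{lbl}(\tau_{21})) \subseteq C(\mathrm{lbl}(\tau_{11}))$ (10) and
$C(\mathrm{lbl}(\tau_{12})) \subseteq C(\mathrm{lbl}(\tau_{22}))$ (11). By (1), (4), (7), (10), and the
induction hypothesis, $\vdash \lfloor \tau_{11} \rfloor_T = \lfloor \tau_{21} \rfloor_T$ (12). By (2), (5), (8),
(11), and the induction hypothesis, $\vdash \lfloor \tau_{12} \rfloor_T = \lfloor \tau_{22} \rfloor_T$ (13).
By (12), (13), and the typing rules:

$\vdash (\forall\alpha.\lfloor \tau_{11} \rfloor_T \to \lfloor \tau_{12} \rfloor_T)^{i_1} = (\forall\alpha.\lfloor \tau_{21} \rfloor_T \to \lfloor \tau_{22} \rfloor_T)^{i_2}$

By definition:

$\vdash \lfloor (\forall\alpha.\tau_{11} \to \tau_{12})^{i_1} \rfloor_T = \lfloor (\forall\alpha.\tau_{21} \to \tau_{22})^{i_2} \rfloor_T$

as required.

• (Box) In this case $\tau_1 = \mathrm{box}(\tau_1')^{i_1}$, $\tau_2 = \mathrm{box}(\tau_2')^{i_2}$, and
$\vdash \tau_1' = \tau_2'$ (1). WLOG, assume $C(i_1) \subseteq C(i_2)$ (2). By the
rules for acceptability, $C;\varrho \vdash \tau_1'$ (3), $(\mathrm{box}\ \mathrm{lbl}(\tau_1'))^{i_1} \in C(i_1)$
(4), $C;\varrho \vdash \tau_2'$ (5), and $\mathit{boxC}(i_2, \mathrm{lbl}(\tau_2'))$ (6). By (4), (2), and
(6), $C(\mathrm{lbl}(\tau_1')) \subseteq C(\mathrm{lbl}(\tau_2'))$ (7). By (1), (3), (5), (7), and the
induction hypothesis, $\vdash \lfloor \tau_1' \rfloor_T = \lfloor \tau_2' \rfloor_T$ (8). By Lemmas 17
and 18, $i_1 \overset{T}{\sim} i_2$. There are two cases:

■ Case 1, $i_1, i_2 \in T$: In this case, $\lfloor \tau_1 \rfloor_T = \lfloor \tau_1' \rfloor_T$,
$\lfloor \tau_2 \rfloor_T = \lfloor \tau_2' \rfloor_T$, and the result is (8).

■ Case 2, $i_1, i_2 \notin T$: In this case, $\lfloor \tau_1 \rfloor_T = \mathrm{box}(\lfloor \tau_1' \rfloor_T)^{i_1}$,
$\lfloor \tau_2 \rfloor_T = \mathrm{box}(\lfloor \tau_2' \rfloor_T)^{i_2}$, and the result follows from (8) and
the typing rules.

■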



Now we can prove that unboxing preserves typing. 



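Theorem 1 (Consistent unboxings preserve typing)

If $C \vdash T$ then:

• If $\Delta;\Gamma \vdash e : \tau$, $C;\varrho \vdash \Gamma$, $C;\varrho \vdash e$, and $T \vdash e$ then
$\Delta;\lfloor \Gamma \rfloor_T \vdash \lfloor e \rfloor_T : \lfloor \tau \rfloor_T$.

• If $\vdash \rho : \Gamma$, $C;\varrho \vdash \rho$, and $T \vdash \rho$ then $\vdash \lfloor \rho \rfloor_T : \lfloor \Gamma \rfloor_T$.

• If $\vdash M : \tau$, $C;\varrho \vdash M$, and $T \vdash M$ then $\vdash \lfloor M \rfloor_T : \lfloor \tau \rfloor_T$.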



Proof: The proof is by induction on the structure of the typing 
judgement. Consider the cases, in the same order as the figure, for 
the last rule used in the derivation: 

• (Variable) In this case $e = x^i$ and $x{:}\tau \in \Gamma$. Then $\lfloor e \rfloor_T = x^i$
and clearly $x{:}\lfloor \tau \rfloor_T \in \lfloor \Gamma \rfloor_T$, so the result follows by the typing
rules.

• (Fix expression) In this case e = (fix f[a]{x:Ti):T2.e'y. 
The typing rule requires that both r — (Va.ri — )> r2)* and 
A;F, /:r, a;:ri h e' : T2. The assumption C; £< h e requires 
Oif) = i q{^) = Ibl(ri), C h r, and C; g h e'. From C h r 
and the rules for acceptability, C h ri . From these facts, C; g\- 
r, f-.T, x:ti. The assumption The requires that T h e'. By 
the induction hypothesis. A; JF, /:t, x:ri |,-f l~ Je'|,-f : Jr2LY- 
Since: 

jr,/:r, a;:ritx = JFLt, /:(Va.Jri Lt ^ lr2iry ,^-lT2ir 
by the typing rules: 

A;JFlx ^ (fix /[Ql(2::JriLT):Jr2lT.Je'lT)' : 

The result follows since: 

JeL-f = (fix f[a]{x:lriir):lr2ir-le'lrT 
Jrt-f = (Va.Jnt-f -> lT2iry 



(Application) In this case, e — (ei[r'] 62)*. The typing rule, 
C; q\- e, and The require that: 



A;rhei : (^a.n ^ Tif 


(1) 


T^T,[T'/a] 




A; F h e2 : r2 


(2) 


A \- T wf 


(3) 


hTi[r7Q]=r2 


(4) 


C;5hei 


(5) 


C;ghT' 


(6) 


C;ghe2 


(7) 



/™C(lbl(ei), Ibl(r), lbl(e2), J) (8) 

T h ei (10) 

T h e2 (11) 

T h r' : r (12) 

for some n, T3, j, and T2. By (1), (5), (10), (2), (7), (11), and 
the induction hypothesis: 

A;jrLThJeilT:J(Va.ri^r)nT (13) 
A;jrLThJe2LT:Jr2lT (14) 

By definition J(Va.ri -> t^Y [^ = (Va.Jnl,-,- -> Jrat-f)^. 
By (3) and Lemma 19, A h Jr'tx wf (15). By (12) and 
Lemma 21, ir(Jr'tx) = r (16). By (1), (2), and Lemma 16: 

C(j) C C(lbl(ei)) (17) 

C; f> h {Va.n -^ rsY (18) 

C(lbl(r2)) C C(lbl(e2)) (19) 

C h r2 (20) 

By (18) and the rules for acceptability: 

C; £. h n (21) 

C; £- h rs (22) 

(V^(a).lbl(ri) ^ lbl(r3))f G C{j) (23) 

By (23), (17), and (8), C(lbl(r')) = C{g{a)) (24) and 
C(lbl(e2)) C C(lbl(ri)). Hence by (19), C(lbl(r2)) C 
C(lbl(ri)). Since lbl(ri[r7Q]) = Ibl(ri), C(lbl(r2)) C 
C(lbl(Ti [t'/q])) (25). By (21), (6), (24), and Lemma 13, 
C; £> h ri[T'/a] (26). By (4), (26), (20), (25), and Lemma 22, 
h Jri[r7a]tT = \T2ir- By (21), (6), (24), and Lemma 20, 
^JnlT[jT'lT/a]=Jr2LT(27).Thusby(13),(14),(15),(16), 

(27), and the typing rules. A; J FtT h (Jei Lt[Jt'Lt1 J 62 It)' : 
jT"3lT[jT'lT/a]- By definition, (22), (6), (24), and Lemma 20, 
A;jr|.-r h J(ei[r'] e2)'LT : J^Lt, as required. 

(Box expression) In this case, e = (box,-// e'Y for some e' and 
i. The typing rule requires that r = box(r")*, A h r" wf, 
A;F h e' : r', and h r" = r' for some r'. The assumption 
C; g \- e requires that C h r and C; g \- e'. The assumption 
The requires that T h e'. By the induction hypothesis, 
A;jr|,-|- h Je'^T : J t'1,t- There are two subcases: 

■ If i G T then J e L-r — J e' |,t and \T[y — J r' Lt and the 
result is immediate. 

■Ifi ^ T then Jetx = (boxj^//^^ Je't-f )' and Jr L-f = 
box(jT'|,T)*- The result follows by the typing rules if 
A h Jr'|,T wf, which holds by Lemma 19, and h 
Jt"Lt ~ J''"'It' which holds by Lemma 22 if its other 
three premises hold. Since C h r, by the rules for ac- 
ceptability, boxC{i,lhl{T")) (1) and C h r", showing 
the first premise. Since A; F h e' : r', by Lemma 16, 
C(lbl(r')) C C(lbl(e')) (2) and C h r', showing the 
second premise. By C; g h e, (boxj^j^/j Ibl(e'))"' G 
C(i). Thus by (1), C(lbl(e')) C C(lbl(r")), so by (2), 
C(lbl(r')) C C(lbl(r")), showing the third premise, as 
required. 
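• (Unbox) In this case, $e = (\mathrm{unbox}\ e')^i$ for some $e'$ and $i$. The
typing rule requires that $\Delta;\Gamma \vdash e' : \mathrm{box}(\tau)^j$ for some $j$.
The assumption $C;\varrho \vdash e$ requires $C;\varrho \vdash e'$. The assumption
$T \vdash e$ requires $T \vdash e'$. By the induction hypothesis,
$\Delta;\lfloor \Gamma \rfloor_T \vdash \lfloor e' \rfloor_T : \lfloor \mathrm{box}(\tau)^j \rfloor_T$. By Lemma 16,
$C(j) \subseteq C(\mathrm{lbl}(e'))$. By Lemmas 17 and 18, $j \overset{T}{\sim} \mathrm{lbl}(e')$.
There are two subcases:

■ If $j \in T$ then $\lfloor e \rfloor_T = \lfloor e' \rfloor_T$ and $\lfloor \mathrm{box}(\tau)^j \rfloor_T = \lfloor \tau \rfloor_T$ and
the result is immediate.

■ If $j \notin T$ then $\lfloor e \rfloor_T = (\mathrm{unbox}\ \lfloor e' \rfloor_T)^i$ and $\lfloor \mathrm{box}(\tau)^j \rfloor_T =
\mathrm{box}(\lfloor \tau \rfloor_T)^j$. The result then follows by the typing rules.

• (Frame) In this case, $e = \rho'(e')^i$ for some $\rho'$, $e'$, and $i$. The
typing rule requires that $\vdash \rho' : \Gamma'$ and $\Delta;\Gamma' \vdash e' : \tau$ for
some $\Gamma'$. The assumption $C;\varrho \vdash e$ requires that $C;\varrho \vdash \rho'$ and
$C;\varrho \vdash e'$. The former requires that $C;\varrho \vdash \Gamma'$. The assumption
$T \vdash e$ requires $T \vdash \rho'$ and $T \vdash e'$. By the induction hypothesis,
$\vdash \lfloor \rho' \rfloor_T : \lfloor \Gamma' \rfloor_T$ and $\Delta;\lfloor \Gamma' \rfloor_T \vdash \lfloor e' \rfloor_T : \lfloor \tau \rfloor_T$. So by the
typing rules, $\Delta;\lfloor \Gamma \rfloor_T \vdash \lfloor \rho' \rfloor_T(\lfloor e' \rfloor_T)^i : \lfloor \tau \rfloor_T$. The result
follows since $\lfloor e \rfloor_T = \lfloor \rho' \rfloor_T(\lfloor e' \rfloor_T)^i$.

• (Constant) In this case $e = c^i$ for some $c$ and $i$. The typing rule
requires that $\tau = \mathrm{B}^i$. Clearly, $\lfloor e \rfloor_T = c^i$, $\lfloor \tau \rfloor_T = \mathrm{B}^i$, and the
result follows by the typing rules.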

• (Fix value) In this case e = (p, fix f[a]{x:Ti):T2.e'y. The 
typing rule require that r — (Va.ri — >■ r2)\ h p ; F', and 
A, a; F', f-.T, x:t\ \- e' : T2. The assumption C\ q\- e requires 
C; £» h p, from which C; £> h F', (?(/) = i, q{x) = Ibl(ri), 
C h r, and C; g \- e . From C h r and the rules for 
acceptability, C h ri. From these facts, C; £> h F', f:T,x:Ti. 
The assumption The requires T h e'. By the induction 
hypothesis, h JpL-r : JF'L-f and A,a; JF', /:r, a;:ri |,-f h 
Je't-f : Jr2tx- Since: 

\r',f:r,x:n[r = \r'[r,f:{\fa.lnir^\r2[ry,x:lT2ir 
by the typing rules: 

A;JFlx ^ (JpLx.fix f[a]{x:lriir):lT2lr.le'lry : 
(Va.Jritx ^ Jt-2Lt)' 
The result follows since: 

JeLx = (JplT,fix /H(a:.-:J-riLT):jT-2lT-Je'lT>' 
Jrt-f = (Va.JriLx ^ Jt-2Lt)' 

• (Box value) In this case, e — («-' :r") for some v, i, and j. The 
typing rule requires that r — box(r")\ A h r" u/. A; F h 
u-* : r', and h r" = r' for some r'. The assumption C; £> h e 
requires that C h r and C; q h ti-*. The assumption The 
requires T h u-'. By the induction hypothesis, A;JF|,-f h 
Iv-' [y : \t'[y- There are two subcases: 

■ If i G T then J e [t = J ^"^ Lt and J r |,-r = J r' [r and the 
result is immediate. 

■Ifi ^ T then JcLt = {Ji;nT:J^"lT>' and Jrtx = 
box(Jr'|,-|-)*. The result follows by the typing rules if 
A h Jr"t-|~ wf, which holds by Lemma 19, and h 
J''"" It — J''"'It' which holds by Lemma 22 if its other three 
premises hold. Since C h r, by the rules for acceptability, 
boxC{i, lbl(r")) (1) and C h r", showing the first premise. 
Since A;r h v^ : r', by Lemma 16, C(lbl(r')) C C(j) 
(2) and C h r', showing the second premise. By C; g \- e, 
{hoxt,^r')j)': e C(i). Thus by (1), C(j) C C(lbl(r")), 
so by (2), C(lbl(r')) C C(lbl(r")), showing the third 
premise, as required. 



(Environment) In this case p = xi:ti — wi* 



• , Xn -Tn 



«„'" and F = xi:ti, . . . ,Xn'-Tn- The typing rule requires that 
0; h Wj*J : Tj and h tj = rj for 1 < j < n and some rjs. 
The assumption C; g \- p requires C; q h r, and C; g h Vj^^ 
for 1 < J < n. Clearly C; g h F' where F' is empty. The 
assumption T h p requires T h Vj^^ for 1 < j < n. By 
the induction hypotheis, 0; h Iv/J [^ ■ JtJLt for 1 < 
j < n. By the rules for acceptability, C(ij) C C(lbl(Tj)) 
for 1 < i < n. By Lemma 16, C(tJ) C C(ij) and C h rj 
for 1 < j < n. Thus C(rj) C C{tj) for I < j < n. 
By Lemma 22, h \Tj[y — J rj |,-|- for 1 < j < n. Then 
by the typing rules h a::i: J TiL-f = Jwi'Hti • • ■ ,3;„:Jr„ Lx = 
Jfn'"LT : a;i:JriLx,- • •.,a:^n:J'r„L-f The resuU follows since 
Ipir = a;i;Jri|,-f = J«i'i It, • • • ,Xn:\Tn[r = J«n'" It and 
JTIt = a:^l:JnLT,•••,a:^n:Jr„t-f• 
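• (Machine state) In this case $M = (\rho, e)$. By the typing rule,
$\vdash \rho : \Gamma$ and $\emptyset;\Gamma \vdash e : \tau$ for some $\Gamma$. The assumption $C;\varrho \vdash M$
requires both $C;\varrho \vdash \rho$ and $C;\varrho \vdash e$. The former requires
$C;\varrho \vdash \Gamma$. The assumption $T \vdash M$ requires $T \vdash \rho$ and $T \vdash e$.
By the induction hypothesis, $\vdash \lfloor \rho \rfloor_T : \lfloor \Gamma \rfloor_T$ and
$\emptyset;\lfloor \Gamma \rfloor_T \vdash \lfloor e \rfloor_T : \lfloor \tau \rfloor_T$. So by the typing rules,
$\vdash (\lfloor \rho \rfloor_T, \lfloor e \rfloor_T) : \lfloor \tau \rfloor_T$.
The result follows since $\lfloor M \rfloor_T = (\lfloor \rho \rfloor_T, \lfloor e \rfloor_T)$.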

■ 
A consequence of type preservation is that unboxed well typed 
programs are traceable. 

Theorem 2

If $\vdash M : \tau$, $C \vdash T$, and $C;\varrho \vdash M$ then $\vdash \lfloor M \rfloor_T$ tr.

Proof: The proof follows from Theorem 1 and Lemma 11. ■ 

4.4 Coherence 

The other part of proving correctness is to show that unboxing 
preserves semantics in some appropriate sense. That requires two 
key lemmas — that a step of the program can be matched by zero 
or more steps of the unboxed program and that consistency is 
preserved under reduction. 

To show the first lemma, we need three technical lemmas — that 
a value's cache is nonempty, that reduction preserves the unboxing 
or not of the outermost label, and a multistep compositionality 
property. 
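The box elimination step treated in the proof of Theorem 3 shows why the unboxed program may need zero steps rather than exactly one. The display below is a sketch in the notation of Figure 8.

```latex
\begin{align*}
\text{source step:}\quad
  & (\rho, (\mathrm{unbox}\ \langle v^i{:}\tau \rangle^j)^k) \longmapsto (\rho, v^i) \\
j \in T:\quad
  & \lfloor (\mathrm{unbox}\ \langle v^i{:}\tau \rangle^j)^k \rfloor_T
    = \lfloor \langle v^i{:}\tau \rangle^j \rfloor_T
    = \lfloor v^i \rfloor_T
    \quad \text{(zero steps)} \\
j \notin T:\quad
  & (\lfloor \rho \rfloor_T, (\mathrm{unbox}\ \langle \lfloor v^i \rfloor_T{:}\lfloor \tau \rfloor_T \rangle^j)^k)
    \longmapsto (\lfloor \rho \rfloor_T, \lfloor v^i \rfloor_T)
    \quad \text{(one step)}
\end{align*}
```

When the box is in the unboxing set, the source redex and its reduct unbox to the same term, so the unboxed machine does not move at all.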

Lemma 23 (Inhabitance)

If $C;\varrho \vdash v^k$ then $\exists s \in C(k)$ such that $\mathrm{lbl}(s) = k$.

Proof: By inspection of the acceptable analysis and acceptable
instantiation rules. ■



Lemma 24 (Unboxing set preservation)

If $C;\varrho \vdash \rho$, $C;\varrho \vdash e$, $C \vdash T$, and $(\rho, e) \longmapsto (\rho, e')$ then
$\mathrm{lbl}(e) \overset{T}{\sim} \mathrm{lbl}(e')$.



Proof: All of the cases for which $\mathrm{lbl}(e) = \mathrm{lbl}(e')$ follow
immediately. For the remaining cases:

• If $(\rho, x^k) \longmapsto (\rho, v^j)$ where $x{:}\tau = v^j \in \rho$ then by the
assumptions we have that $\varrho(x) = \mathrm{lbl}(\tau)$ (1), $C(j) \subseteq C(\mathrm{lbl}(\tau))$
(2) and $C(\varrho(x)) \subseteq C(k)$ (3), so by transitivity we have $C(j) \subseteq
C(k)$ (4). By Inhabitance (Lemma 23) we have an $s \in C(j)$ (5)
such that $\mathrm{lbl}(s) = j$ (6), and so by Agreement (Lemma 18)
we have $k \overset{T}{\sim} j$. Since $\mathrm{lbl}(e) = k$ and $\mathrm{lbl}(e') = j$, the result
follows.

• If $(\rho, (\mathrm{unbox}\ \langle v^i{:}\tau \rangle^j)^k) \longmapsto (\rho, v^i)$ then we must show that
$k \overset{T}{\sim} i$. By Inhabitance we have $s \in C(i)$ with $\mathrm{lbl}(s) = i$, so by
Agreement, it suffices to show that $C(i) \subseteq C(k)$. By the box
rule for an acceptable analysis, there is an $s = (\mathrm{box}_t\, l)^j \in C(j)$
such that $C(i) \subseteq C(l)$. Since $s \in C(j)$, by the rule for unbox,
$C(l) \subseteq C(k)$, so $C(i) \subseteq C(k)$ and we are done.

• If $(\rho, \rho'(v^i)^j) \longmapsto (\rho, v^i)$ then we must show that $j \overset{T}{\sim} i$. By
Inhabitance, there is an $s \in C(i)$ with $\mathrm{lbl}(s) = i$, and by the acceptable
analysis rule for frames we have that $C(i) \subseteq C(j)$, so by Agreement we
have that $j \overset{T}{\sim} i$.

■



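Lemma 25 (Many step compositionality)

If $(\rho, e_1) \longmapsto^* (\rho, e_2)$ then:

• $(\rho, (e_1[\tau]\, e)^i) \longmapsto^* (\rho, (e_2[\tau]\, e)^i)$

• $(\rho, (v^j[\tau]\, e_1)^i) \longmapsto^* (\rho, (v^j[\tau]\, e_2)^i)$

• $(\rho, (\mathrm{box}_\tau\, e_1)^i) \longmapsto^* (\rho, (\mathrm{box}_\tau\, e_2)^i)$

• $(\rho, (\mathrm{unbox}\ e_1)^i) \longmapsto^* (\rho, (\mathrm{unbox}\ e_2)^i)$

• $(\rho, \rho'(e_1)^i) \longmapsto^* (\rho, \rho'(e_2)^i)$

Proof: The proof is by an easy induction on the length of the
reduction sequences. ■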

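Theorem 3 (Single step reduction coherence)

If $\vdash M : \tau$, $C;\varrho \vdash M$, $C \vdash T$, $T \vdash M$, and $M \longmapsto M'$ then
$\lfloor M \rfloor_T \longmapsto^* \lfloor M' \rfloor_T$.

Proof: The proof is by induction on the derivation of $M \longmapsto M'$.
Consider the cases for the last rule used to derive it:

• If $(\rho, x^i) \longmapsto (\rho, v^j)$ where $x{:}\tau = v^j \in \rho$ then by definition
$\lfloor M \rfloor_T = (\lfloor \rho \rfloor_T, x^i)$, $\lfloor M' \rfloor_T = (\lfloor \rho \rfloor_T, \lfloor v^j \rfloor_T)$, and
$x{:}\lfloor \tau \rfloor_T = \lfloor v^j \rfloor_T \in \lfloor \rho \rfloor_T$. Thus $\lfloor M \rfloor_T \longmapsto \lfloor M' \rfloor_T$ by the same rule.

• If:

$(\rho, (\mathrm{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e)^i) \longmapsto (\rho, \langle \rho, \mathrm{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e \rangle^i)$

then the unboxings of the two expressions are of the same form, and the
same reduction step applies.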

• If (p, (box^/ Vt'Y) I — > (p, {vt'.T'y) where tr{T') = i then: 

■ If j ^ T then: 
By the definition of unboxing, \el^ — (boxj^-'i v^, ) 
where tij, ^Jut'Lx- 

By hypothesis, h p ; T, ; T h ut* : r" and h r' = r" for 
somer". By hypothesis, T h Wt'. By Theorem 1, 0; JFLy h 
K' -J '''" Lt ■ The proof of that theorem also showed that h 
J^'Lt = Jr"tT,sobyLemma4, ir(Jr'lT) = tr{lT"ir)- 
By Lemma 6, ir(Jr"t-f) = *', so ir(Jr'Lx) = *'■ 



By definition of reduction (JpL-r, (boxj^'i v'^, ) ) 

(Jpl^,(«;,'=:Jr'lT>'). 
■ If j £ T then: 

By definition of unboxing 

Jetx = v'^, where Uj, ^ \vt''[r 
By definition of reduction 

(JpIt,^^')^*(JpLt,^^''=) 

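• If $(\rho, (e_1[\tau']\, e_2)^i) \longmapsto (\rho, (e_1'[\tau']\, e_2)^i)$ then:

By definition of $C;\varrho \vdash M$ we have that $C;\varrho \vdash \rho$ and $C;\varrho \vdash e_1$.
Hence we have that $C;\varrho \vdash (\rho, e_1)$. By the typing rules we also
have that $\vdash \rho : \Gamma$ and $\emptyset;\Gamma \vdash e_1 : \tau_1$ for some $\Gamma$ and $\tau_1$,
so $\vdash (\rho, e_1) : \tau_1$. By the rules for consistency, $T \vdash \rho$ and
$T \vdash e_1$, so $T \vdash (\rho, e_1)$. Hence by induction we have that
$(\lfloor \rho \rfloor_T, \lfloor e_1 \rfloor_T) \longmapsto^* (\lfloor \rho \rfloor_T, \lfloor e_1' \rfloor_T)$.
By Lemma 25

$(\lfloor \rho \rfloor_T, (\lfloor e_1 \rfloor_T[\lfloor \tau' \rfloor_T]\, \lfloor e_2 \rfloor_T)^i) \longmapsto^* (\lfloor \rho \rfloor_T, (\lfloor e_1' \rfloor_T[\lfloor \tau' \rfloor_T]\, \lfloor e_2 \rfloor_T)^i)$

By definition of unboxing

$(\lfloor \rho \rfloor_T, \lfloor (e_1[\tau']\, e_2)^i \rfloor_T) \longmapsto^* (\lfloor \rho \rfloor_T, \lfloor (e_1'[\tau']\, e_2)^i \rfloor_T)$

• If $(\rho, (v^j[\tau']\, e_2)^i) \longmapsto (\rho, (v^j[\tau']\, e_2')^i)$ then the result
follows by the symmetric argument to the previous case.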



If (p,(V[^V*'=)') ^ (p,p"(e"[r'/al)') 



where: 



Vf = (p ,fix f[a]{x:Ti):T2.e ) 
p" = p'J-.T = Vf^,x:Ti = Vt'' 



T - 
t 



(Va.ri — ) 



T2) 



then: 

By definition of unboxing we have that: 

v'f = (Jp't-f,fix /[a](a:::JriL-f) 



\T-i\l 






J^t'LT 



\p'ir,f-lTir = 
{le"lr'/a]Wy 



v'/, x:\Tiir 



Vf 



By hypothesis, h p : F, 0; F h ut : r", and I- t[ — r". 
By Theorem 1, 0;jrtT- h u^/*'' : Jrl't,-- The proof of that 
theorem also showed that h Jri [-r — Iri'ly. By Lemma 4, 
trilriir) = trQri'ir). By Lemma 6, tr{\Ti'[r) = t' . Thus 
tr{\ tiIy) ~ ^' ■ So by the application beta rule: 



(JplT,Jet-f) ' — > (JpLt,P (e ) ) 



where: 



p'" 


= 


lp[-r,f:r" = vy,x:ri'-- 


= Vf 


t" 


= 


(VQ.JriLT^Jr2lT)' 




ri' 


= 


Iriirllr'ir/a] 




e'" 


= 


\e"iTl\r'ir/a] 





By definition of unboxing, r" = Jr^-f- The proof of the 
theorem above also showed that C; g \- n, C; g \- r', and 
C(lbl(r')) — C{g{a)). By the rules for acceptability, clearly 
C;g\- e". By Lemma 20, n = Jntr ande'" = le"[T'/a][r. 
Putting that altogether, p"'(e"') — J e' I,-,-, as required. 

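• If $(\rho, (\mathrm{box}_{\tau'}\, e)^j) \longmapsto (\rho, (\mathrm{box}_{\tau'}\, e')^j)$ then: By definition of
acceptability $C;\varrho \vdash \rho$ and $C;\varrho \vdash e$, so $C;\varrho \vdash (\rho, e)$. By
the typing rules, $\vdash \rho : \Gamma$ and $\emptyset;\Gamma \vdash e : \tau''$ for some $\Gamma$ and
$\tau''$, so $\vdash (\rho, e) : \tau''$. By the rules for consistency, $T \vdash \rho$
and $T \vdash e$, so $T \vdash (\rho, e)$. So by the induction hypothesis,
$(\lfloor \rho \rfloor_T, \lfloor e \rfloor_T) \longmapsto^* (\lfloor \rho \rfloor_T, \lfloor e' \rfloor_T)$.

■ If $j \in T$ then:

By definition of unboxing
$\lfloor (\mathrm{box}_{\tau'}\, e)^j \rfloor_T = \lfloor e \rfloor_T$
By definition of unboxing
$\lfloor (\mathrm{box}_{\tau'}\, e')^j \rfloor_T = \lfloor e' \rfloor_T$
By induction
$(\lfloor \rho \rfloor_T, \lfloor e \rfloor_T) \longmapsto^* (\lfloor \rho \rfloor_T, \lfloor e' \rfloor_T)$

■ If $j \notin T$ then:

By definition of unboxing
$\lfloor (\mathrm{box}_{\tau'}\, e)^j \rfloor_T = (\mathrm{box}_{\lfloor \tau' \rfloor_T}\, \lfloor e \rfloor_T)^j$
$\lfloor (\mathrm{box}_{\tau'}\, e')^j \rfloor_T = (\mathrm{box}_{\lfloor \tau' \rfloor_T}\, \lfloor e' \rfloor_T)^j$
Hence by the induction hypothesis and Lemma 25 we have that:
$(\lfloor \rho \rfloor_T, (\mathrm{box}_{\lfloor \tau' \rfloor_T}\, \lfloor e \rfloor_T)^j) \longmapsto^* (\lfloor \rho \rfloor_T, (\mathrm{box}_{\lfloor \tau' \rfloor_T}\, \lfloor e' \rfloor_T)^j)$

• If $(\rho, (\mathrm{unbox}\ e)^k) \longmapsto (\rho, (\mathrm{unbox}\ e')^k)$ then let $i = \mathrm{lbl}(e)$ and
$i' = \mathrm{lbl}(e')$. By Lemma 24 we have that $i \overset{T}{\sim} i'$. By the rules
for acceptability, $C;\varrho \vdash \rho$ and $C;\varrho \vdash e$, so $C;\varrho \vdash (\rho, e)$.
By the typing rules, $\vdash \rho : \Gamma$ and $\emptyset;\Gamma \vdash e : \tau'$ for some $\Gamma$
and $\tau'$, so $\vdash (\rho, e) : \tau'$. By the rules for consistency, $T \vdash \rho$
and $T \vdash e$, so $T \vdash (\rho, e)$. So by the induction hypothesis,
$(\lfloor \rho \rfloor_T, \lfloor e \rfloor_T) \longmapsto^* (\lfloor \rho \rfloor_T, \lfloor e' \rfloor_T)$.

■ If $i, i' \in T$ then:

By definition of unboxing
$\lfloor (\mathrm{unbox}\ e)^k \rfloor_T = \lfloor e \rfloor_T$
By definition of unboxing
$\lfloor (\mathrm{unbox}\ e')^k \rfloor_T = \lfloor e' \rfloor_T$
By induction
$(\lfloor \rho \rfloor_T, \lfloor e \rfloor_T) \longmapsto^* (\lfloor \rho \rfloor_T, \lfloor e' \rfloor_T)$

■ If $i, i' \notin T$ then:

By definition of unboxing
$\lfloor (\mathrm{unbox}\ e)^k \rfloor_T = (\mathrm{unbox}\ \lfloor e \rfloor_T)^k$
By definition of unboxing
$\lfloor (\mathrm{unbox}\ e')^k \rfloor_T = (\mathrm{unbox}\ \lfloor e' \rfloor_T)^k$
By induction
$(\lfloor \rho \rfloor_T, \lfloor e \rfloor_T) \longmapsto^* (\lfloor \rho \rfloor_T, \lfloor e' \rfloor_T)$
By Lemma 25
$(\lfloor \rho \rfloor_T, \lfloor (\mathrm{unbox}\ e)^k \rfloor_T) \longmapsto^* (\lfloor \rho \rfloor_T, \lfloor (\mathrm{unbox}\ e')^k \rfloor_T)$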

■ k 

If (p, (unbox {v^-.tY) ) I — y (p, v^) then: 
■ If $j \in T$ then:

By definition of unboxing 

J (unbox {v*:Tyf [r = J {v'-.tY Lt = J«' It 
So in zero steps 

(Jpb,J(unbox(t;":r)^)'b) ^* (lpir,\v'lr) 
■ If $j \notin T$ then:

By definition of unboxing 

J(unbox (u':r) ) [^ = [unbox \ {v' -.t) [y) 
By definition of unboxing 

(unboxJ(u':r) [^) = (unbox (Ju' |,t:J''"It) ) 
By definition of reduction 

(JpLt, (unbox (Ji;nT:J^lT>'y)^ 
(JpLxJ^nx) 

If (p,p'(ei)') I — y (p,p'(e2)') then: 

By the rules for acceptability, C; g \- p' and C; g h ei, so 
C;g\- (p, ei). By the typing rules, h p' : F' and 0; F' h ei : r 
for some F', so h (p',ei) : r. By the rules for consistency, 
T h p' and T h ei, so T h (p',ei). So by the induction 
hypothesis, (Jp'It, Jei tf ) ' — >* (Jp'Lt, Je2lT)- 



By definition of unboxing 

Jp'(ei)nT=Jp'lT(JeilTr 
By definition of unboxing 

Jp'(e2nT=Jp'lT(Je2lTr 
By induction 

(Jp'LT.JeilT) I — >* (Jp'LT,Je2lT) 
By Lemma 25 

(JplT,Jp'LT(JeilT)') ^* (JplT,Jp'lT(Je2lTr) 

.If(p,p'(w')')^(p,«')then: 
By definition of unboxing 

Unboxed value is a value, so by reduction rules 

(JplT,Jp'lT(J«nT)')^^(JplT,J^nT) 

■ 
To show preservation of consistency we need a type substitution 
lemma. 

Lemma 26 

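If $C \vdash T$, $T \vdash \tau : r$, $\tau$ is not a type variable, and
$C(\mathrm{lbl}(\tau)) = C(\varrho(\alpha))$ then:

• If $T \vdash \tau' : r$ and $C; \varrho \vdash \tau'$ then $T \vdash \tau'[\tau/\alpha] : r$.

• If $T \vdash e$ and $C; \varrho \vdash e$ then $T \vdash e[\tau/\alpha]$.

• If $T \vdash \rho$ and $C; \varrho \vdash \rho$ then $T \vdash \rho[\tau/\alpha]$.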

Proof: The proof is by simultaneous induction on the derivation
of $T \vdash \tau' : r$, $T \vdash e$, and $T \vdash \rho$. The cases for expressions and
environments are straightforward. Consider the cases for types:

• Case 1, r' — a^ If r = a^ then T'[T/a\ = a*. Since 
C; £> h r', C(i) = C(g(a)), so C(i) = C(j). By Lemmas 17 

T 

and 18, i ~ j. Then by inspection of the rules, T h a' : r as 
T h o-^ ; r. 

• Case 2, r' = /?* and a ^ /?: Then t'[t /a] — r' and the result 
is immediate. 

• Case 3, r' = B*: Then T h r' : r is not possible. 

• Case 4, r' = (V^.n -^ ra)': Then T h t'It/o] : r, as 
required. 

• Case 5, r' = (box(r"))\ i G T: By the rule, T h r" : r. 
Assumption C; g h r' requires C; g h r". By the induc- 
tion hypothesis, T h T"[T/a] : r. By the consistency rules, 
T h box(r"[r/Q])' : r. By definition of substitution, T h 
box(r")'[r/a] : r, as required. 

• Case 6, r' = (box(r"))\ i ^ T: Then T h t'It/o] : r, as 
required. 



Lemma 27 

If $T \vdash M_1$, $C \vdash T$, $C; \varrho \vdash M_1$, $\vdash M_1 : \tau$, and $M_1 \mapsto M_2$ then
$T \vdash M_2$.



Proof: The proof is by induction on the derivation of $M_1 \mapsto M_2$.
Let $M_1 = (\rho, e_1)$ and $M_2 = (\rho, e_2)$. By the rules for consistency,
$T \vdash \rho$ and $T \vdash e_1$. The result follows if $T \vdash e_2$. The typing rules
require $\vdash \rho : \Gamma$ and $\emptyset; \Gamma \vdash e_1 : \tau'$ for some $\Gamma$ and $\tau'$. Assumption
$C; \varrho \vdash M_1$ requires $C; \varrho \vdash \rho$ and $C; \varrho \vdash e_1$. Consider the cases for
the last rule used (in the same order as the figure):

• (Variable) In this case: ei — x^, 62 = v-' , and x:t' — v-' £ p. 
By T h p, T h I'-' , as required. 
• (Fix expression) In this case: $e_1 = (\mathrm{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e)^i$ and
$e_2 = (\rho, \mathrm{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e)^i$. Then $T \vdash e_1$ requires $T \vdash e$,
then since $T \vdash \rho$, $T \vdash e_2$, as required.

• (Box expression) In this case: ei — (boxrW*) and 62 = 
(v^-.r) . Then T h ei requires T h w\ so T h 62, as required. 

• (Application left) In this case, we have ei = (e3[r] 64)', 
62 ~ (e-blr] 64)', and (p, 63) 1 — > (PjCs) is a subderivation. 
Then T h ei requires T h 63, T h 64, and T h r : r; 
0; r h ei : r' requires 0; F h 63 : r" for some r"; C; (? h ei 
requires C; £> h 63. By the induction hypothesis, T h eg, so by 
the consistency rules, T h 62, as required. 

• (Application right) In this case: ei — (63 [r] 64)', 62 = 
(e3[r] 65)*, and (p, 63) 1 — > (p, 65) is a subderivation. Then 
T h ei requires T h 63, T h 64, and T h r ; r; 0; F h ei ; r' 
requires 0; F h 64 : r" for some r"; C; £> h ei requires 
C; Q \- 64. By the induction hypothesis, T h 65, so by the 
consistency rules, T h 62, as required. 

• (Application beta) In this case: 

61 = (ui'[r] V2^) 

VI = (p',fix f[a]{x:Ti):T2.e) 

62 = p"(6[r/«])'= 

p" = p', /:t' = «i', a;:r{ = V2^ 
t' = (Va.ri — !> r2)* 
Ti = n [r/a] 

By T h 61 and the rules for consistency, T h wi% T h p', 
T h 6, T h r : r, and T h U2-'. Thus by the rules for 
consistency, T h p". By the typing rule h r' w/, so r' 
cannot be a type variable. Assumption C; q \- M requires 
C; Q \- e and, as in previous proofs, C(lbl(r)) = C((j(i)). By 
Lemma 26, T h 6[r/a]. By the rules for consistency, T h 62, 
as required. 

• (Under box) Similar to application left. 

• (Under unbox) Similar to application left.

• (Unbox beta) In this case: 61 = (unbox (u':r)"') and 62 = u'. 
Then T h 61 requires T h w*, as required. 

• (Under frame) Similar to application left. 

• (Frame return) In this case: 61 = p'(u')"' and 62 = «'. Then 
T h 61 requires T h w*, as required. 

■ 
With these lemmas we can prove our semantics preservation 
result. 

Theorem 4 (Coherence) 

• If h M : r, C; £1 h M, C h T, T I- M, and M 1 — >* (p, «') 

thenJMtr < — >* (JpItJ«Ht)- 

• If h M : r, C; e h M, C h T, T F M, and M I — > ■ ■ ■ then 

Proof: 

• By induction on reduction derivations, using Theorem 3. 

1. If M I — >■* {p,v^) in zero steps, then the result follows 
immediately. 

2. If M I — >* (p, «') in n steps, then by definition, Af 1 — > 
M' and M' 1 — >* (p, u') in n - 1 steps. 

By Theorem 3 



By Theorem 1 

h M' : r' 

By Lemma 15 

C;g\- M' 

By Lemma 27 

ThM' 

By induction 

JM'Lt^-^* {lp[r,\v'W) 

By the definition of many-step reduction
lM[r>^' {\p[r,lvnr) 
• In the operational semantics, there are six leaf reductions. Two 
of them take expression forms to value forms, but otherwise 
leave the term unchanged. One of them takes unbox of box
of a value to that value. One of them takes a frame of a value 
to that value. Thus if we measure a term by adding its size, 
number of lambda expressions, and number of box expressions, 
then this metric strictly decreases for these three leaf reductions. 
Therefore, in any infinite reduction sequence, there must be 
an infinite number of steps whose leaf reduction is a variable 
reduction or an application beta reduction. Then observe in the 
proof of Theorem 3 that the unboxing of a variable redex or 
of an application beta redex will always take a step, and that 
Lemma 25 preserves this. Thus the unboxing will also take an 
infinite number of steps. 

■ 
Theorem 4 shows that if two terms are related by reduction, 
then their images under the unboxing function are also related by 
the many step reduction relation given that the unboxing pair is 
acceptable; and that if a term diverges under reduction, then its 
image under the unboxing function also diverges. In other words, 
for an acceptable analysis and an acceptable unboxing, the induced 
unboxing function preserves the semantics of the original program 
up to elimination of boxes. Since the semantics of the core language 
only defines reduction steps that preserve GC safety, this theorem 
implies that the image of a GC safe program under unboxing is also 
GC safe. 

5. Construction of an acceptable unboxing 

The previous section gives a declarative specification for when an 
unboxing set T is correct but does not specify how such a set 
might be chosen. In this section we give a simple algorithm for 
constructing an acceptable unboxing given an arbitrary acceptable 
flow analysis. 

The idea behind the algorithm is that given a program and an 
acceptable flow analysis for it, we use the results of the flow anal- 
ysis to construct the connected components of the interprocedu- 
ral flow graph of the program. All of the elements of a connected 
component will then either be unboxed together, or not unboxed at 
all. Any such choice of unboxing (as we will show) satisfies the 
cache coherence property. The only remaining requirement is that 
the choice of unboxing set be consistent, which is easily satisfied 
by ensuring that any connected component which includes a type 
passed to a polymorphic function is only unboxed if the unboxing 
of the type argument still has traceability r. In the rest of the sec- 
tion, we make this informal algorithm concrete and show that the 
choice of unboxing that it produces is in fact acceptable. 

For the purposes of this section we ignore environments and the
intermediate forms $\rho(e)$, $(\rho, \mathrm{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e)^j$ and $(v^i{:}\tau)^j$.
These constructs are present in the language solely as mechanisms 
to discuss the dynamic semantics — in this sense they can be thought 
of as intermediate terms, rather than source terms. It is straightfor- 
ward to incorporate these into the algorithm if desired. 
Given a flow analysis $(C, \varrho)$ and program $e$ such that $C; \varrho \vdash e$,
we define the induced undirected flow graph $\mathcal{FG}$ as an undirected
graph with a node for every label in $C$, and edges as follows:

• For every label $i$ and every shape $s \in C(i)$, we add an edge
between $i$ and $\mathrm{lbl}(s)$.

The edges simply connect up each program point with all of its 
reaching definitions. 

Given a flow graph $\mathcal{FG}$, we can find the connected components
in the usual way. Let $CC$ be a mapping which maps labels to the
connected component in which they occur. Note that by definition
each label occurs in exactly one connected component. It is easy to
show that any connected component is cache consistent.
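
The construction of the induced flow graph and its connected components can be sketched with a union-find structure. This is only an illustrative sketch: the dictionary encoding of the cache $C$, the encoding of shapes as (kind, label) pairs, and the names connected_components, example_cache, and CC are assumptions made here, not part of the formal development.

```python
from collections import defaultdict


def connected_components(cache, lbl):
    """Map every label to the connected component (a frozenset) containing it.

    cache: dict from label to a set of shapes, standing in for C.
    lbl:   function from a shape to its label (hypothetical encoding).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, shapes in cache.items():
        find(i)  # every label in the cache is a node, even if isolated
        for s in shapes:
            ri, rs = find(i), find(lbl(s))
            if ri != rs:
                parent[ri] = rs  # edge between i and lbl(s)

    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return {x: frozenset(groups[find(x)]) for x in list(parent)}


# Shapes are encoded as (kind, label) pairs; this encoding is an assumption.
example_cache = {1: {("box", 2)}, 2: set(), 3: {("base", 3)}, 4: {("box", 2)}}
CC = connected_components(example_cache, lambda s: s[1])
```

Here labels 1, 2, and 4 end up in one component because both 1 and 4 have a shape labelled 2 in their cache entries, while label 3 forms a component of its own.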

Lemma 28 (Cache consistency of a connected component) 

Given any acceptable analysis $(C, \varrho)$ with induced flow graph $\mathcal{FG}$,
and any connected component $S$ of $\mathcal{FG}$, $S$ is cache consistent: that
is, $C \vdash S$.

Proof: To show that $C \vdash S$ we must show that
$\forall i, s : s \in C(i) \implies i \overset{S}{\sim} \mathrm{lbl}(s)$. But note that by the construction of
the induced flow graph $\mathcal{FG}$, whenever $s \in C(i)$ there is an edge
between $i$ and $\mathrm{lbl}(s)$, and consequently by definition of a connected
component, $i$ and $\mathrm{lbl}(s)$ must be in the same connected component.
Since every label occurs in exactly one connected component,
either both $i$ and $\mathrm{lbl}(s)$ are in $S$ or both are not in $S$. By definition
then, $i \overset{S}{\sim} \mathrm{lbl}(s)$. ■

It is also easy to show that the union of any two disjoint cache 
consistent sets is also cache consistent. 

Lemma 29 (Cache consistency (unary) closure) 

Given any acceptable analysis $(C, \varrho)$ and disjoint label sets $S_1$ and
$S_2$, if $C \vdash S_1$ and $C \vdash S_2$ then $C \vdash S_1 \cup S_2$.

Proof: To show that $C \vdash S_1 \cup S_2$ we must show that
$\forall i, s : s \in C(i) \implies i \overset{S_1 \cup S_2}{\sim} \mathrm{lbl}(s)$. Consider an arbitrary label $i$. If $i$ is
not in $S_1 \cup S_2$, then we have that $i$ is not in $S_1$ and not in $S_2$, and
hence by assumption, $\mathrm{lbl}(s)$ is not in $S_1$ and not in $S_2$, and hence
we have agreement. If $i$ is in $S_1 \cup S_2$, then it must be in either $S_1$ or
$S_2$. WLOG, assume that $i \in S_1$. By assumption, $i \overset{S_1}{\sim} \mathrm{lbl}(s)$, and so
$\mathrm{lbl}(s) \in S_1$, and hence $\mathrm{lbl}(s) \in S_1 \cup S_2$ and we have agreement.
■ 

Consequently, we can show that any set consisting of a union of 
connected components of the induced flow graph is cache consis- 
tent. 

Lemma 30 (Cache consistency closure) 

Given any acceptable analysis $(C, \varrho)$ with induced flow graph $\mathcal{FG}$,
and any set $SS$ of connected components of $\mathcal{FG}$, $\bigcup SS$ is cache
consistent.

Proof: By Lemma 28, each connected component is cache consistent.
By definition, any two connected components are disjoint, and
so by Lemma 29 the union of any two connected components is
cache consistent, and is disjoint from any other connected component.
The cache consistency of $\bigcup SS$ follows directly by induction.
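
The cache consistency condition used in Lemmas 28 to 30 can be checked directly on a small example. The sketch below assumes a dictionary encoding of the cache and (kind, label) shapes; the function name cache_consistent and the example data are hypothetical and serve only to illustrate the definition.

```python
def cache_consistent(cache, lbl, S):
    """Check C |- S: whenever s is in C(i), i and lbl(s) agree on membership in S."""
    return all((i in S) == (lbl(s) in S)
               for i, shapes in cache.items()
               for s in shapes)


lbl = lambda s: s[1]  # shapes encoded as (kind, label); an assumption
cache = {1: {("box", 2)}, 2: set(), 3: {("base", 3)}, 4: {("box", 2)}}
component_a, component_b = {1, 2, 4}, {3}
```

In this example each component is cache consistent, and so is their union, whereas a set such as {1} that splits a component is not.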



5.1 The algorithm 

Given the set of connected components for the induced flow graph,
the algorithm begins with an initial unboxing set $T$ consisting of
the union of all of the connected components. By Lemma 30, we
have that $C \vdash T$. The algorithm then proceeds by considering in
turn each application sub-term $e_1[\tau]\,e_2$ as follows:



- For each sub-term of $e$ of the form $e_1[\tau]\,e_2$:
  - if $\mathrm{lbl}(\tau) \in T$, and if $T \vdash \tau : b$, then:
    - $T \leftarrow T - CC(\mathrm{lbl}(\tau))$.

That is, for any application for which the current unboxing results 
in the type argument being unboxed to a non-reference type, we 
remove the connected component for the type from the unboxing 
set. Note that after removing a connected component from T, 
the new unboxing set T is still cache consistent since it is still a 
union of connected components (just a union of one less connected 
component). 
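
The loop above can be sketched as follows. The tuple encoding of types, the helper consistent_traceability, and the name choose_unboxing are assumptions made for illustration. In particular, the sketch assumes that a box type whose label is in $T$ is consistent at the traceability of its contents, that base types are consistent at b, and that all other types are consistent at r.

```python
def lbl(t):
    return t[-1]  # by convention here, the label is the last tuple field


def consistent_traceability(T, t):
    """Return "b" or "r" for the judgement T |- t : _ (assumed rules)."""
    if t[0] == "base":
        return "b"
    if t[0] == "box" and lbl(t) in T:
        return consistent_traceability(T, t[1])  # unboxed: look at contents
    return "r"  # type variables, functions, and boxes that stay boxed


def choose_unboxing(CC, type_args):
    """CC: dict label -> component; type_args: the types tau of each e1[tau] e2."""
    T = set().union(*CC.values())  # start from the union of all components
    for tau in type_args:
        if lbl(tau) in T and consistent_traceability(T, tau) == "b":
            T -= CC[lbl(tau)]  # drop the whole component of lbl(tau)
    return T


CC = {1: {1, 2}, 2: {1, 2}, 3: {3, 4}, 4: {3, 4}, 5: {5}}
int_box = ("box", ("base", 5), 1)       # box(B^5)^1
var_box = ("box", ("var", "a", 4), 3)   # box(alpha^4)^3
T = choose_unboxing(CC, [int_box, var_box])
```

In this example the component of label 1 is removed, because unboxing the first type argument would expose a base type, while the component of label 3 is kept, because the contents of the second box still have traceability r.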

With the help of some technical lemmas, it is straightforward to 
show that the final unboxing set T computed by the algorithm is an 
acceptable unboxing for the program. 

To begin with, we observe that if a type's label is not in the un- 
boxing set T, then it is consistent and its traceability is unchanged 
by the unboxing. 

Lemma 31 (Type consistency) 

For any unboxing set $T$ and type $\tau$, if $\mathrm{lbl}(\tau) \notin T$ then $T \vdash \tau : \mathrm{tr}(\tau)$.

Proof: By inspection. 

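• (Variable) $\mathrm{tr}(\alpha^i) = r$, and $T \vdash \alpha^i : r$.

• (Base type) $\mathrm{tr}(B^i) = b$, and $T \vdash B^i : b$.

• (Fun type) $\mathrm{tr}((\forall\alpha.\tau_1 \to \tau_2)^i) = r$, and $T \vdash (\forall\alpha.\tau_1 \to \tau_2)^i : r$.

• (Box type) $\mathrm{tr}(\mathrm{box}(\tau')^i) = r$, and by assumption $i \notin T$, so we
have that $T \vdash \mathrm{box}(\tau')^i : r$.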

■ 

It is also the case that the consistent type judgement defines a
total function on types, and hence for any type we either have that
it is consistent at traceability r or that it is consistent at traceability
b.

Lemma 32 (Type consistency is a total function)

For any unboxing set $T$ and type $\tau$, either $T \vdash \tau : b$, or $T \vdash \tau : r$.

Proof: By induction on types. All of the cases follow immediately
except when $\tau = \mathrm{box}(\tau')^i$ and $i \in T$. In that case, by induction we
have that either $T \vdash \tau' : b$, or $T \vdash \tau' : r$, and so by construction
either $T \vdash \tau : b$, or $T \vdash \tau : r$. ■

Theorem 5 

If $\Delta; \Gamma \vdash e : \tau$, $C; \varrho \vdash \Gamma$, and $C; \varrho \vdash e$, and if $T$ is the unboxing set
computed by the algorithm in this section, then $T$ is an acceptable
unboxing for $e$. That is, $C \vdash T$ and $T \vdash e$.

Proof: The conclusion that $C \vdash T$ follows almost immediately
from Lemma 30. The initial choice of $T$ is a union of connected
components, and hence is cache consistent. At every step of the
algorithm, we may remove a single connected component from $T$.
The result is still a union of connected components (since
connected components are disjoint), and hence the result of removing
a connected component is still cache consistent by Lemma 30.

The conclusion that $T \vdash e$ follows by induction on the structure
of the typing derivation.

• (Variable) In this case, $e = x^i$, consistency is immediate.

• (Fix) In this case $e = (\mathrm{fix}\ f[\alpha](x{:}\tau_1){:}\tau_2.e')^i$. To get
consistency, we must show that $T \vdash e'$. The last rule applied in
the typing judgement must have been the fix rule, and by its
premises we have that $\Delta \vdash (\forall\alpha.\tau_1 \to \tau_2)^i$ wf (1), and that
$\Delta, \alpha; \Gamma, f{:}(\forall\alpha.\tau_1 \to \tau_2)^i, x{:}\tau_1 \vdash e' : \tau_2$ (2). The last rule
applied in the acceptable analysis judgement must also have been
the fix rule, and by its premises we have that $C; \varrho \vdash e'$ (3).
To apply the induction hypothesis, we need (1), (3), and that
$C; \varrho \vdash \Gamma, f{:}(\forall\alpha.\tau_1 \to \tau_2)^i, x{:}\tau_1$ (4). To show (4), it is sufficient
to show that:

■ $\varrho(f) = i$, which is a premise of the acceptable analysis
derivation

■ $C; \varrho \vdash (\forall\alpha.\tau_1 \to \tau_2)^i$, which is a premise of the acceptable
analysis derivation

■ $\varrho(x) = \mathrm{lbl}(\tau_1)$, which is a premise of the acceptable
analysis derivation

■ $C; \varrho \vdash \tau_1$, which is a sub-premise of the derivation of
$C; \varrho \vdash (\forall\alpha.\tau_1 \to \tau_2)^i$.

So by (1), (3), and (4), we have by induction that $T \vdash e'$.

• (Application) In this case $e = (e_1[\tau]\,e_2)^i$. To prove consistency,
we need that $T \vdash e_1$ (1), $T \vdash e_2$ (2), and $T \vdash \tau : r$
(3). Inverting the typing derivation and the acceptable analysis
derivation immediately gives us the premises we need to apply
the induction hypothesis to get (1) and (2). To prove (3), note
that a premise of the typing derivation gives us that $\mathrm{tr}(\tau) = r$
(4). If $\mathrm{lbl}(\tau) \notin T$, then by Lemma 31 we have that $T \vdash \tau : \mathrm{tr}(\tau)$
and so by (4) we're done. If $\mathrm{lbl}(\tau) \in T$, then by the
definition of the algorithm, we must have that $T \vdash \tau : b$ does
not hold (since otherwise the algorithm would have removed
the connected component containing $\mathrm{lbl}(\tau)$ from $T$), and so by
Lemma 32 we must have that $T \vdash \tau : r$ and we're done.

• (Box) All of the premises needed to apply the induction hypothesis
are available immediately by inverting the typing derivation
and the acceptable analysis derivation.

• (Unbox) All of the premises needed to apply the induction hypothesis
are available immediately by inverting the typing derivation
and the acceptable analysis derivation.

• (Constant) Follows immediately.



Thus we have shown by construction that the specification de- 
fined in Section 4 is a useful one in the sense that it is satisfiable. 



6. Open Terms 

The paper so far has considered whole-program optimization and 
proved that unboxing in that setting is correct. We would like to 
be able to optimize program fragments where we have part of the 
program but know nothing about the rest of the program. Such
a setting adds one more correctness criterion: since we are not
optimizing the rest of the program, anything that flows across the 
boundary to or from the rest of the program must remain as boxed 
as it originally was. We can ensure this requirement by simply 
requiring that nothing on the boundary is in the unboxing set. This 
section formalizes these ideas and proves them correct. 

For our purposes, a program fragment is a module, which is
a triple $(\Gamma \Rightarrow e : \tau)$. $\Gamma$ specifies the imports of the module, $e$
specifies the body of the module, which exports only one thing,
the value that $e$ evaluates to, and $\tau$ specifies the type of the export.
We wish to optimize modules without making any assumptions
about the code that the module is linked to. In particular that means
we cannot unbox any of the imports nor unbox anything exported.
This requirement can be achieved by not unboxing any subterm of
any type in $\Gamma$ nor in $\tau$.
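
The boundary requirement can be sketched as a recursive check that no label occurring in an import type or in the export type belongs to the unboxing set, following the shape of the "not unboxed" judgement in Figure 10. The tuple encoding of types and the names not_unboxed and boundary_ok are assumptions made for illustration.

```python
def lbl(t):
    return t[-1]  # by convention here, the label is the last tuple field


def not_unboxed(T, t):
    """True when no label on t or on any of its sub-types is in T."""
    if lbl(t) in T:
        return False
    if t[0] == "fun":   # ("fun", alpha, t1, t2, label)
        return not_unboxed(T, t[2]) and not_unboxed(T, t[3])
    if t[0] == "box":   # ("box", contents, label)
        return not_unboxed(T, t[1])
    return True         # type variables and base types


def boundary_ok(T, imports, export_type):
    """Check every import type and the export type of a module."""
    return (all(not_unboxed(T, t) for t in imports.values())
            and not_unboxed(T, export_type))


imports = {"x": ("box", ("base", 2), 1)}
export = ("fun", "a", ("var", "a", 3), ("box", ("base", 5), 4), 6)
```

With these hypothetical types, an unboxing set that mentions only labels internal to the module body, such as {7}, passes the check, while any set containing a label from the boundary types, such as {5}, does not.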

We can extend the definitions of well typedness, acceptability of 
flow analysis, unboxing, and consistency of unboxing to modules. 



\- {r ^e-.r) wf 



'hV wf 0; F h e : r 



C;q\- (T^e-.T) 



C;g\-'r C;gh'e C;g\-'T 
C;^h (F^e:r) 



J(F^e:r)LT 




J(F^e:r) 


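$T \vdash \tau$ not unboxed

$$\frac{i \notin T}{T \vdash \alpha^i \text{ not unboxed}} \qquad \frac{i \notin T}{T \vdash B^i \text{ not unboxed}}$$

$$\frac{i \notin T \quad T \vdash \tau_1 \text{ not unboxed} \quad T \vdash \tau_2 \text{ not unboxed}}{T \vdash (\forall\alpha.\tau_1 \to \tau_2)^i \text{ not unboxed}}$$

$$\frac{i \notin T \quad T \vdash \tau \text{ not unboxed}}{T \vdash \mathrm{box}(\tau)^i \text{ not unboxed}}$$

$T \vdash \Gamma$ not unboxed

$$\frac{\forall 1 \le j \le n : T \vdash \tau_j \text{ not unboxed}}{T \vdash x_1{:}\tau_1, \ldots, x_n{:}\tau_n \text{ not unboxed}}$$

$T \vdash (\Gamma \Rightarrow e : \tau)$

$$\frac{T \vdash \Gamma \text{ not unboxed} \quad T \vdash e \quad T \vdash \tau \text{ not unboxed}}{T \vdash (\Gamma \Rightarrow e : \tau)}$$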



Figure 10. Judgements for modules 
Figure 11. Stronger Analysis

and the formal judgements appear in Figure 10. The acceptability
of a flow analysis for modules is stronger than that for programs.
The rules for type variables, box expressions, and box values are
replaced with those in Figure 11; the other rules remain the same.
The rules for boxes require a more precise shape in the cache; any
actual flow analysis would use such a shape, so this requirement
is not a burden. The rules for programs are weaker as the stronger
conditions are not closed under reduction whereas the weaker
conditions are. The stronger condition for type variables is required to
ensure consistency for type variables, and is also not a burden.

The goal is to show that unboxing is correct for modules. A
suitable notion of correctness is that a module and its unboxing
are contextually equivalent. Rather than define contextual
equivalence directly, we will use as our definition a notion that is usually
proven equivalent to contextual equivalence. Namely, two
expressions are equivalent if, in any environment that closes them and any
elimination context for their type, they are observably equivalent.
The formal definition is in Figure 12.

The strategy is that we will take the context and alpha vary it and 
relabel it so that it is sufficiently distinct from the module. Then we 
will argue that we can modify the flow analysis and unboxing to 
cover the context without unboxing any of it. Then by coherence 
the module in context will behave the same as the unboxing of the 
module in context, which because the context is not unboxed, will 
act the same as the unboxed module in context. 

First we formalize and prove that the operational semantics is
insensitive to the alpha variant and labels used. Let $x \sim_s y$ mean
that $x$ and $y$ are alpha variants and possibly relabelled.



If r = 

T h 



Xi:t\, . 



Tj not unboxed for 



then: T h F not unboxed requires 
1 ^ J ^ ^- So by the first item. 



Lemma 33 

If $M_1 \sim_s M_2$ and $M_1 \mapsto M_3$ then there exists $M_4$ such that
$M_3 \sim_s M_4$ and $M_2 \mapsto M_4$.



Proof: The proof is by a straightforward induction on the derivation
of $M_1 \mapsto M_3$. ■
Next we prove three lemmas about unboxing preservation. In 
the first two we show that something's unboxing is that something 
because either the not unboxed judgement (the first lemma) or the 
labels in the something are not in the unboxing set (the second 
lemma). In the third we show the unboxing of an expression is the 
same if the unboxing set is the same on the labels in the expression. 
To state and prove these and subsequent lemmas we need a function 
to return all the labels in an expression, type, or environment. It is
defined in Figure 13. 
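
A label-collection function of this kind can be sketched over a generic tuple encoding of terms. The encoding (a tag, then children, then the label as the last field) and the example term are assumptions made here for illustration; the definition in Figure 13 is the authoritative one.

```python
def lbls(node):
    """Collect every label in a term encoded as (tag, *children, label).

    Children that are not tuples (variable names, constants) carry no labels.
    """
    if not isinstance(node, tuple):
        return set()
    out = {node[-1]}
    for child in node[1:-1]:
        out |= lbls(child)
    return out


# (f^1 [B^2] (box 7^3)^4)^5 in the hypothetical encoding
app = ("app", ("var", "f", 1), ("base", 2), ("box", ("const", 7, 3), 4), 5)
labels = lbls(app)

# Two unboxing sets that agree on the labels of the term
T1, T2 = {2, 4, 9}, {2, 4, 10}
same_on_term = (T1 & labels) == (T2 & labels)
```

The last two lines illustrate the premise of the third lemma: the sets T1 and T2 differ only on labels that do not occur in the term.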

Lemma 34 

• If T h r not unboxed then \T[-f- = r. 

• If T h r not unboxed then jr|,-f = F. 

Proof: 

• The proof is by induction on the structure of r. Consider the 
cases; 

■ Case 1, r = q': Then by definition Jr I,-,- = t, as required. 

■ Case 2, r = B' : Then by definition J r I,-,- = t, as required. 

■ Case 3, r = (Vq.ti — t- T2Y: Then T h r not unboxed 
requires T h ri not unboxed and T h r2 not unboxed. 
By the induction hypothesis, Jri |,-f = ri and \t2[-^ = T2. 
By definition, J r [x = t, as required. 

■ Case 4, r = box(r')': Then T h r not unboxed requires 
i ^ T and T h r' not unboxed. By the induction hypoth- 
esis, \T'[y = r'. By definition, \T[y = t, as required. 



J '''3 It = "^i for 1 ^ J ^ J^- Then by definition, J F [^ = F, as 
required. 



Lemma 35 

• If Ibis (p) n T = 

• If ibis(£;) n T : 



i then J p Lt = 
3 tiien J E L-f 



E. 



Proof: The proof is a straightforward induction on the structure
of $\rho$ and $E$. ■



Lemma 36 

If Ti n Ibls(e) 



T2nlbls(e) thenjetxi = Jel-, 



Proof: The proof is a straightforward induction on the structure of
$e$. ■

Next we state and prove our main technical lemma. This lemma 
states that we can rewrite the context and flow analysis to have 
certain desirable properties, namely that the flow analysis covers 
the context and the module, that the context is not unboxed, that 
the module is unboxed as before, and the unboxing set and flow 
analysis remain consistent and consistent with the module and 
context. 

Lemma 37 

If- 

01-F u>/ 
0;ri-e :r 

C;Q^e 

ChT 

T h F not unboxed 

The 

T h r not unboxed 

hp:V 

Th E:B'{t) 

then there exist $\rho'$, $E'$, $C'$, $\varrho'$, and $T'$ such that:
p~sp' 

E r~.^E' 

hp':F 
FhS' :B^(r) 
C';o'h{p',E'{e)) 
C hT' 

T'h(p',S'(e» 
Ibls(p') n T' = 
Ibls(S') n T' = 
T n Ibls(e) = T' n Ibls(e) 

Proof: Let $V$ be the set of variables that occur in $e$. Let $A$ be the
set of type variables that occur in $\Gamma$, $e$, or $\tau$. Both these sets are
finite.

The derivation of $C; \varrho \vdash \Gamma$, $C; \varrho \vdash e$, and $C; \varrho \vdash \tau$ will, for each
type that is not a type variable, require a particular type shape with
some label on it in the cache of the label of that type; similarly,
each box expression and box value requires a box shape with some
label of its contents in the cache. Let $L$ consist of one such label for each
such type and such box, together with $\varrho(V) \cup \varrho(A) \cup \mathrm{lbls}(\Gamma) \cup \mathrm{lbls}(e) \cup \mathrm{lbls}(\tau)$.
Note that $L$ is a finite set.

Let $\varrho'$ be $\varrho$ on $V$ and $A$, and on every other variable or type
variable let it map to a fresh label (distinct from each other and
E



I {E[t] ey I (unbox S)' 



r h : r{T) 

r h £ : {^ct.Ti -^ T2y (r) h r w/ ir (r) = r 0; T h e : r{ h n [r/a] = ri 

rh(£[r]e)':r2[T/a](r) 

rhS:box(r')^(T) 

rh (unbox S)' -.Tir) 



Ml = M2 

r h ei = 62 : r 

h (Fi ^ ei : n; 



(Vc, i : (Ml 



>* c' « Afa 



^* c")) A (Ml 



■■■ ^ M2 



;r h ei : r A0;r h 62 : r A 



(r2 



\/p,E: hp:rArh£;:B"(r) ^ (p,C(ei)) = (p,C(e2» 
Fi = r2 A ri = r2 A (Fi h ei = 62 : ri) 



Figure 12. Contextual Equivalence 
$\mathrm{lbls}(e_1[\tau]\,e_2)$
$\mathrm{lbls}(\tau) \cup \mathrm{lbls}(e)$
Figure 13. The labels in a type, expression, or environment



from $L$). Define:

$$C''(i) = \begin{cases} \{ s \mid s \in C(i) \wedge \mathrm{lbls}(s) \subseteq L \} & i \in L \\ \emptyset & i \notin L \end{cases}$$
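
The restriction of the cache to $L$ can be sketched as a simple filter. The dictionary encoding of the cache, the encoding of shapes as tuples whose fields after the tag are labels, and the name restrict_cache are assumptions made for illustration. The sketch also assumes that entries for labels outside $L$ become empty.

```python
def restrict_cache(cache, L, shape_labels):
    """Keep, for labels in L, only the shapes whose labels all lie in L.

    Entries for labels outside L are emptied (an assumption of this sketch).
    """
    return {i: ({s for s in shapes if shape_labels(s) <= L} if i in L else set())
            for i, shapes in cache.items()}


shape_labels = lambda s: set(s[1:])  # shapes encoded as (tag, *labels)
C = {1: {("box", 2), ("box", 9)}, 2: {("base", 2)}, 9: {("base", 9)}}
C2 = restrict_cache(C, {1, 2}, shape_labels)
```

In this example the shape mentioning label 9 is dropped from the entry for label 1, and the entry for label 9 itself is emptied.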



Claim: $C''; \varrho' \vdash \Gamma$, $C''; \varrho' \vdash e$, and $C''; \varrho' \vdash \tau$. The proof is by
induction on the derivation; consider the last rule used:

• (Variable) In this case $e = x^i$ and $C(\varrho(x)) \subseteq C(i)$.
Since $x \in V$, $\varrho'(x) = \varrho(x)$ and $\varrho(x) \in L$. Also $i \in L$.
Therefore, $C''(\varrho'(x)) \subseteq C''(i)$, as required. Thus by the same
rule, $C''; \varrho' \vdash e$, as required.

• (Fix expression) In this case e = (fix f[a\{x:T\):T2.e'y , 
qU) = i q{x) = Ibl(ri), C;gh- (Va.n ^ r2)\ C; ^ h e', 
and (V£'(a).lbl(ri) -^ lbl(e')),^C(i). Since /, a; e V and q £ 
A e'if) = f?(/), f?'(x) = ^(a;), e'(a) = Q{a), and f?(aXe L. 
By the induction hypothesis, C"; g' h (Va.ri — 5- r2)* and 
C"; g' \- e'. Since g'(a) e L, Ibl(ri) G i, Ibl(e) G i, and 
i G L, (V£i'(a).lbl(ri) -^ lbl(e'))„^C"(i). Thus by the same 
rule, C" ; g' \- e, as required. 



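• (Application) In this case $e = (e_1[\tau]\,e_2)^i$, $C; \varrho \vdash e_1$, $C; \varrho \vdash \tau$,
$C; \varrho \vdash e_2$, and $funC(\mathrm{lbl}(e_1), \mathrm{lbl}(\tau), \mathrm{lbl}(e_2), i)$. By the
induction hypothesis, $C''; \varrho' \vdash e_1$, $C''; \varrho' \vdash \tau$, and $C''; \varrho' \vdash e_2$.
Since $\mathrm{lbl}(e_1) \in L$, $\mathrm{lbl}(\tau) \in L$, $\mathrm{lbl}(e_2) \in L$, and $i \in L$, it is
easy to see that $funC(\mathrm{lbl}(e_1), \mathrm{lbl}(\tau), \mathrm{lbl}(e_2), i)$ holds for $C''$. Thus
by the same rule, $C''; \varrho' \vdash e$, as required.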

Other cases are similar . . . 



Let $A'$ be the set of type variables that appear in $\Gamma$ and $\tau$. We
construct $\rho'$ and $E'$ as alpha variants and relabellings of $\rho$ and $E$
as follows. Since $\vdash \rho : \Gamma$, $\rho$ contains $\Gamma$, so we keep that part
the same. Type variables that are in $A'$ we keep the same. For all
other type variables and variables we pick an alpha variant that is
fresh (distinct from each other and from $A$, respectively $V$). The
outermost label on types on variables we relabel to the binding label
for that variable. All other labels we relabel to be fresh. Clearly
$\rho \sim_s \rho'$ and $E \sim_s E'$.

Claim: $\vdash \rho' : \Gamma$ and $\Gamma \vdash E' : B^j(\tau)$ for some $j$. The proof is a
straightforward induction on the structure of $\rho'$ and $E'$.
Now we need to build a $C'$ such that $C'; \varrho' \vdash (\rho', E'\langle e\rangle)$. We
start from $C''$. First we add into the caches the shapes required directly
for the rules for $C'; \varrho' \vdash \rho'$ and $C'; \varrho' \vdash E'$ (such things are already
there for $e$). In the case of types we add shapes using the label of
the type as the label of the shape. In the case of box expressions
and values we use the label of the contents of the box as the label
of the contents of the shape. What remains is a bunch of subset and
equality constraints between cache entries, so we pick $C'$ to be the
smallest larger cache that satisfies these constraints. Clearly such a
$C'$ exists and by construction, $C'; \varrho' \vdash (\rho', E'\langle e\rangle)$.
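
The second phase of this construction is a least fixed point computation. A minimal sketch, assuming only unconditional subset constraints of the form $C'(j) \subseteq C'(i)$ (an equality constraint can be given as two subset constraints), is shown below. The conditional constraints arising from the box and function flow conditions are not modelled, and the name close_under_subsets and the example data are hypothetical.

```python
def close_under_subsets(cache, subset_constraints):
    """Smallest cache containing `cache` that satisfies each (src, dst)
    constraint, read as cache[src] being a subset of cache[dst]."""
    result = {i: set(shapes) for i, shapes in cache.items()}
    changed = True
    while changed:
        changed = False
        for src, dst in subset_constraints:
            missing = result.get(src, set()) - result.setdefault(dst, set())
            if missing:
                result[dst] |= missing  # only propagates existing shapes
                changed = True
    return result


seed = {1: {("base", 1)}, 2: set(), 3: set()}
closed = close_under_subsets(seed, [(2, 3), (1, 2)])
```

Each iteration only copies shapes that already occur somewhere in the cache, so the loop terminates on a finite cache, and the result does not depend on the order in which the constraints are listed.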

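Set $T' = T \cap L$. Clearly, $T \cap \mathrm{lbls}(e) = T' \cap \mathrm{lbls}(e)$ as
$\mathrm{lbls}(e) \subseteq L$. By construction, the labels of $\rho'$ and $E'$ are in the
labels of $\Gamma$ or $\tau$ or are not in $L$. Since $T \vdash \Gamma$ not unboxed and
$T \vdash \tau$ not unboxed, the labels of $\Gamma$ and $\tau$ are not in $T$. Therefore,
$\mathrm{lbls}(\Gamma) \cap T' = \emptyset$ and $\mathrm{lbls}(\tau) \cap T' = \emptyset$. In fact, if $A''$ is a set of
type variables in $\rho'$ and $E'$ then $\varrho'(A'') \cap T' = \emptyset$ too.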

Claim: any flow from the interface to a boxC(i, j) condition has 
a box type at the interface (**), and similarly for funC{i,j, k, I). 
The proof is by induction on the flow conditions noting that in all 
cases the two end points have the same type. 

CIaim:C' h T'.Let sand i be such that s G C'(i).Ifs £ C"(i) 
then s £ C(i), i £ L, and Ibls(s) C L, and in particular, 

Ibl(s) G L. Since C h T, i ~ Ibl(s). Since T' = T n L, 

T' 

i £ L, and Ibl(s) £ L, i c::^ Ibl(s), as required. Otherwise, we 
claim that i, Ibl(s) ^ T'. Let Lc = Ibls(p') U lbls(£;') U g'{A"), 
Li = Ibls(F) U Ibls(r), Lm = L - Lc, and La = Lc U L. 
First notice that C" has entries only for labels in L and with shapes 
whose labels are in L. The first part of computing C' added shapes 
to cache entries for labels in Lc with shapes whose labels are in 
Lc- The second part of computing C' only propagates existing 
shapes from one cache entry to another, and only from/to cache 
entries in La or in labels in shapes in the cache entries. Thus, 
the cache entries of C' are only for La with shapes with labels 
in La- If Ibl(s) G Lc then by previous argument Ibl(s) ^ T', as 
required. If Ibl(s) G Lm then we will show that s G C"(j) for 
some j G -Lj. Then s G C(j), and since C h T and j ^ T, 
Ibl(s) ^ T so Ibl(s) ^ T', as required. If i e Lc then by 
previous argument i ^ T', as required. If i G Lm then we will 
show that C"(j) C C"(i) for some j G Li- Then since C"(j) 
is inhabited because it labels a type checked in C" ; g' h F or 
C"; g' \- T, and since that required type shape has labels in L, 
/ C(j) C C{i). Then since C h T and j ^ T, i ^ T, so 
i ^ T', as required. It remains to show the two conditions we 
claimed. Since C' was computed using a least fixed point, we prove 
these claims by induction on when s was added to C'(i). Consider 
the cases: 

• Case 1, s was added to i because s G C'(j), s ^ C'(i), 
and C'(j) C C(i) is required by the rules for variables, box 
expressions, frames, box values, or environments. In this case, i 
and j have to come from the same term, that is, either i,j£ Lc 
or i,j G L. If Ibl(s) G Lm then s must have been added to 
C(j) previously in the second phase of constructing C', so by 
the induction hypothesis, Ibl(s) G C"(fc) for some k G Li- 
If i G Lm then j G L. First note that the condition on j 
and i is also required to show that C"; g' h F, C"; g' h e, 
or C";g' h r, so C"(j) C C"(i). If j G Lj then we have 
what we need. Otherwise j G Lm, so s was added to C'(j) 
previously in the second phase of constructing C'{j), so by the 
induction hypothesis, C"(fc) C C"(j) for some k G Lj- Then 
C"(fc) C C"(i), as required. 

• Case 2, s was added to i because C{g'{a)) — C(i), required 
by the rule for type variables, did not hold and s G C{g'{a))- 
Similar to Case 1. 



• Case 3, $s$ was added to $i$ because $\mathrm{boxC}(j, i)$ is required, either 
$(\mathrm{box}_t\ i')^l \in C'(j)$ or $(\mathrm{box}\ i')^l \in C'(j)$, and $s$ was already 
in $C(i')$. In this case, $i$ and $j$ have to come from the same term, 
that is, either $i, j \in L_c$ or $i, j \in L$. Note that the labels in 
any shape under consideration come from the same term, that 
is, either they are all in $L$ or they are all in $L_c$. 

• If $\mathrm{lbl}(s) \in L_m$ then: 

— If $s$ was added to $C'(i')$ previously in the second phase 
of constructing $C'$ then by the induction hypothesis, 
$s \in C''(k)$ for some $k \in L_I$. 

— Otherwise $s \in C''(i')$ and $i', j' \in L$. Since $s$ was not 
already in $C'(i)$ then the box shape was added to $C'(j)$ 
previously in the second phase of constructing $C'$, so by 
the induction hypothesis, the box shape is in $C''(k)$ for 
some $k \in L_I$. By (**), $k$ labels a box type. By the rules 
for acceptability, $\mathrm{boxC}(k, k')$ holds for $C''$ for some $k' \in L_I$. 
Thus, $C''(i') \subseteq C''(k')$. Thus $s \in C''(k')$, as required. 

• If $i \in L_m$ then $j \in L$ and $\mathrm{boxC}(j, i)$ holds for $C''$. 

— If the box shape was added to $C'(j)$ previously in the 
second phase of constructing $C'$ then by the induction 
hypothesis, $C''(k) \subseteq C''(j)$ for some $k \in L_I$. 
By (**), $k$ labels a box type. Then by the rules for 
acceptability, $(\mathrm{box}\ i'')^l \in C''(k)$ for some $i'' \in L_I$, so 
$(\mathrm{box}\ i'')^l \in C''(j)$. By $\mathrm{boxC}(j, i)$, $C''(i'') \subseteq C''(i)$, 
as required. 

— Otherwise, the box shape was in $C''(j)$ and $i', j' \in L$. 
By $\mathrm{boxC}(j, i)$, $C''(i') \subseteq C''(i)$. Since $s$ was not already 
in $C'(i)$ then it was previously added to $C'(i')$ in the 
second phase of constructing $C'$, so by the induction 
hypothesis, $C''(k) \subseteq C''(i')$ for some $k \in L_I$. Then 
by transitivity $C''(k) \subseteq C''(i)$, as required. 

• Case 4, $s$ was added because a $\mathrm{funC}(j_1, j_2, j_3, j_4)$ is required. 
Similar to Case 3. 

Claim: $T' \vdash (\rho', E'(e))$. The proof is by a straightforward 
induction on the structure of $(\rho', E'(e))$. The only interesting case 
is application. In that case, we have $(e_1[\tau]\ e_2)^i$. By the induction 
hypothesis we get $T' \vdash e_1$ and $T' \vdash e_2$. We just need to show that 
$T' \vdash \tau : \tau$. If the application came from $e$ then since the labels 
of $T$ are in $L$, the result follows from $T \vdash \tau : \tau$, which holds by 
assumption ($T \vdash e$). Otherwise, the labels of $\tau$ are not in $T'$ so 
clearly $T' \vdash \tau : \mathrm{tr}(\tau)$. Then $\mathrm{tr}(\tau) = \tau$ holds by the typing rules. 

■ 

With these definitions we can prove that unboxing for modules is 
correct. 



Theorem 6 

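If $\vdash (\Gamma \Rightarrow e : \tau)\ \mathrm{wf}$, $C; \varrho \vdash (\Gamma \Rightarrow e : \tau)$, $C \vdash T$, and 
$T \vdash (\Gamma \Rightarrow e : \tau)$ then $\vdash (\Gamma \Rightarrow e : \tau) = [\![ (\Gamma \Rightarrow e : \tau) ]\!]_T$. 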



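Proof: By definition, $[\![ (\Gamma \Rightarrow e : \tau) ]\!]_T = (\Gamma \Rightarrow [\![ e ]\!]_T : \tau)$. Clearly 
$\Gamma = \Gamma$ and $\tau = \tau$, so it remains to show that $\Gamma \vdash e = [\![ e ]\!]_T : \tau$. 
By $\vdash (\Gamma \Rightarrow e : \tau)\ \mathrm{wf}$, $\vdash \Gamma\ \mathrm{wf}$ and $\Gamma \vdash e : \tau$. By 
$C; \varrho \vdash (\Gamma \Rightarrow e : \tau)$, $C; \varrho \vdash \Gamma$, $C; \varrho \vdash e$, and $C; \varrho \vdash \tau$. 
By $T \vdash (\Gamma \Rightarrow e : \tau)$, $T \vdash \Gamma$ not unboxed, $T \vdash e$, and 
$T \vdash \tau$ not unboxed. By Theorem 1, $[\![ \Gamma ]\!]_T \vdash [\![ e ]\!]_T : [\![ \tau ]\!]_T$. By 
Lemma 34, $\Gamma \vdash [\![ e ]\!]_T : \tau$. Let $\rho$ and $E$ be such that $\vdash \rho : \Gamma$ and 
$\Gamma \vdash E : B^i(\tau)$. Then by Lemma 37, there exist $\rho'$, $E'$, $C'$, $\varrho'$, and 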
T' such that: 

E r^,E' 

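$\vdash \rho' : \Gamma$ 

$\Gamma \vdash E' : B^{i'}(\tau)$ 

$C'; \varrho' \vdash (\rho', E'(e))$ 

$C \vdash T'$ 

$T' \vdash (\rho', E'(e))$ 

$\mathrm{lbls}(\rho') \cap T' = \emptyset$ 

$\mathrm{lbls}(E') \cap T' = \emptyset$ 

$T \cap \mathrm{lbls}(e) = T' \cap \mathrm{lbls}(e)$ 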

Since the operational semantics is deterministic, we just need to 
show that $(\rho, E([\![ e ]\!]_T))$ matches $(\rho, E(e))$ in behaviour. There are 
two cases: 

• If $(\rho, E(e)) \mapsto^* (\rho, c^j)$ then by Lemma 33, $(\rho', E'(e)) \mapsto^* 
(\rho', c^{j'})$ for some $j'$. By Theorem 4, $[\![ (\rho', E'(e)) ]\!]_{T'} \mapsto^* 
[\![ (\rho', c^{j'}) ]\!]_{T'}$. By both Lemma 35 and the definition of unboxing, 
$(\rho', E'([\![ e ]\!]_{T'})) \mapsto^* (\rho', c^{j'})$. Hence by Lemma 36, 
$(\rho', E'([\![ e ]\!]_T)) \mapsto^* (\rho', c^{j'})$. Therefore by Lemma 33 again, 
$(\rho, E([\![ e ]\!]_T)) \mapsto^* (\rho', c^{j''})$, for some $j''$. It is not too hard to 
see that $j = j''$, as required. 

• If $(\rho, E(e)) \mapsto \cdots$ then by Lemma 33, $(\rho', E'(e)) \mapsto \cdots$. 
By Theorem 4, $[\![ (\rho', E'(e)) ]\!]_{T'} \mapsto \cdots$. By Lemma 35, 
$(\rho', E'([\![ e ]\!]_{T'})) \mapsto \cdots$. By Lemma 36, $(\rho', E'([\![ e ]\!]_T)) \mapsto \cdots$. 
By Lemma 33, $(\rho, E([\![ e ]\!]_T)) \mapsto \cdots$, as required. 



7. Related work 

This paper provides a modular approach to showing correctness 
of a realistic compiler optimization that rewrites the structure of 
program data structures in significant ways. Our approach uses an 
arbitrary interprocedural reaching definitions analysis to eliminate 
unnecessary heap allocation in an intermediate representation in 
which object representation has been made explicit. Our optimization 
can be staged freely with other optimizations. Unlike any previous 
work that we are aware of, we account for correctness with 
respect to the meta-data requirements of the garbage collector. For 
presentational purposes, we have restricted our attention to the core 
concern of GC safety, but additional issues such as value size, 
dynamic type tests, etc. are straightforward to incorporate. 

There has been substantial previous work addressing the problem 
of unboxing. Peyton Jones [3] introduced an explicit distinction 
between boxed and unboxed objects to provide a linguistic account 
of unboxing, and hence to allow a high-level compiler to locally 
eliminate unboxes of syntactically apparent box introduction 
operations. Leroy [4] defined a type-driven approach to adding 
coercions into and out of specialized representations. The type-driven 
translation represented monomorphic objects natively (unboxed, in 
our terminology), and then introduced wrappers to coerce polymorphic 
uses into an appropriate form. To a first-order approximation, 
instead of boxing at definition sites this approach boxes objects 
at polymorphic use sites. This style of approach has the problem 
that it is not necessarily beneficial, since allocation is introduced 
in places where it would not otherwise be present. This is reflected 
in the slowdowns observed on some benchmarks described in the 
original paper. This approach also has the potential to introduce 
space leaks. In a later paper [5] Leroy argued that a simple untyped 
approach gives better and more predictable results. 

Henglein and Jørgensen [2] defined a formal notion of optimality 
for local unboxings and gave two different choices of coercion 
placements that satisfy their notion of optimality. Their definition 
of optimality explicitly does not correspond in any way to reduced 
allocation or reduced instruction count and does not seem to provide 
uniform improvement over Leroy's approach. 

The MLton compiler [11] largely avoids the issue of a uniform 
object representation by completely monomorphizing programs before 
compilation. This approach requires whole-program compilation. 
More limited monomorphization schemes could be considered 
in an incremental compilation setting. Monomorphization does not 
eliminate the need for boxing in the presence of dynamic type tests 
or reflection. Just-in-time compilers (e.g. for .NET) may monomorphize 
dynamically at runtime. 

The TIL compiler [1, 10] uses intensional type analysis in a 
whole-program compiler to allow native data representations without 
committing to whole-program compilation. As with the Leroy 
coercion approach, polymorphic uses of objects require conditionals 
and boxing coercions to be inserted at use sites, and consequently 
there is the potential to slow down, rather than speed up, 
the program. 

Serrano and Feeley [9] described a flow analysis for performing 
unboxing substantially similar in spirit to our approach. Their 
algorithm attempts to find a monomorphic typing for a program 
in which object representations have not been made explicit, which 
they then use selectively to choose whether to use a uniform or 
non-uniform representation for each particular object. Their approach 
differs in that they define a dedicated analysis rather than using a 
generic reaching definitions analysis. They assume a conservative 
garbage collector and hence do not need to account for the 
requirements of GC safety, and they do not prove a correctness result. 
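As a rough illustration of how the result of a reaching definitions 
analysis can drive an unboxing decision, the following Python sketch 
computes which box allocation sites may be eliminated. It is a 
simplification under stated assumptions, not the formal analysis of 
this paper and not the algorithm of Serrano and Feeley: the function 
name `unboxable`, its parameters, and the site labels are hypothetical. 
The assumption is that a use site needing the uniform representation 
forces every box reaching it to stay boxed, and that all definitions 
reaching a common use must agree on representation, so the "must stay 
boxed" property is propagated to a fixed point. 

```python
def unboxable(box_defs, reaches, must_stay_boxed):
    """Return the box allocation sites that may be unboxed.

    box_defs        -- set of labels of box allocation sites
    reaches         -- dict mapping each use-site label to the set of
                       definition labels that may reach it
    must_stay_boxed -- set of use-site labels that need the uniform
                       (boxed) representation, e.g. polymorphic uses
    All three inputs are hypothetical stand-ins for analysis results.
    """
    boxed_uses = set(must_stay_boxed)  # uses that force boxing
    boxed_defs = set()                 # boxes that cannot be removed
    changed = True
    while changed:
        changed = False
        for use, defs in reaches.items():
            # A use is forced to see boxed values if it was seeded, or
            # if some definition reaching it is already forced to stay
            # boxed, or is not a box allocation we control at all.
            forced = use in boxed_uses or any(
                d in boxed_defs or d not in box_defs for d in defs
            )
            if not forced:
                continue
            if use not in boxed_uses:
                boxed_uses.add(use)
                changed = True
            # Every box reaching a forced use must itself stay boxed.
            for d in defs:
                if d in box_defs and d not in boxed_defs:
                    boxed_defs.add(d)
                    changed = True
    return box_defs - boxed_defs


# b3 reaches the polymorphic use u3, so it stays boxed; b2 shares the
# use u2 with b3, so it stays boxed too; only b1 can be unboxed.
result = unboxable(
    {"b1", "b2", "b3"},
    {"u1": {"b1"}, "u2": {"b2", "b3"}, "u3": {"b3"}},
    {"u3"},
)
print(sorted(result))  # ['b1']
```

The sketch deliberately ignores types and GC meta-data, which are the 
concerns the formal development addresses; it only shows why a single 
constrained use site can keep several allocation sites boxed. 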